(54) Title: COMBINED FLUORESCENCE AND REFLECTANCE SPECTROSCOPY



(57) Abstract 

Methods and apparatus for performing fluorescence spectroscopy on a sample. A
sample is irradiated with a fluorescence excitation fiber (30) and radiation is
collected from the sample with a fluorescence collection fiber (60) and detected
to form fluorescence spectra. The sample is also illuminated with a reflectance
illumination fiber and reflected light from the sample is collected at a
plurality of collection positions and detected to form spatially resolved
reflectance spectra. The fibers may form a probe arranged in concentric
sections. The spectra are analyzed by preprocessing and reducing the
dimensionality of the spectral data.
DESCRIPTION

COMBINED FLUORESCENCE AND REFLECTANCE SPECTROSCOPY 



BACKGROUND OF THE INVENTION 

1. Field of the Invention

The present invention relates generally to the field of optical imaging. More
particularly, it concerns apparatus and methods for combining fluorescence and
reflectance spectroscopy for the imaging of samples, including both in situ and ex situ
imaging of body tissues.

2. Description of Related Art 

Cancer is one of the leading causes of death in the United States and in the
world. In the United States alone, deaths from cancer are estimated to number
560,000 in 1997 (American Cancer Society Online, Cancer Facts & Figures).
Currently, diagnosis and treatment of cancer follow histopathologic evaluation of
directed biopsies. However, the tissue removal necessitated by these techniques not
only may alter the progression of the disease (Robbins and Kumar, 1984) but is also
very costly. Improving the capability for in situ monitoring of disease progression
could greatly enhance the ability to detect and treat cancer and precancer (Kelloff et
al., 1992).

A growing number of clinical studies have demonstrated that fluorescence
spectroscopy may be used to distinguish normal and abnormal human tissues in vivo
in the skin, head and neck, genito-urinary tract, gastro-intestinal tract, breast, and
brain. It is well known that fluorescence intensity and lineshape are a function of both
the excitation and emission wavelength in samples containing multiple chromophores,
such as human tissue. A complete characterization of the fluorescence properties of
an unknown sample requires measurement of a fluorescence excitation emission
matrix, in which the fluorescence intensity is recorded as a function of both excitation
and emission wavelength. The field of analytical chemistry has exploited the
fluorescence properties of different compounds to identify and quantify them in
mixtures.
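
As an illustration of the excitation emission matrix described above, the
following minimal sketch stores fluorescence intensity on a grid indexed by
excitation and emission wavelength. The wavelength grids, the synthetic Gaussian
peak, and the function name `make_eem` are assumptions chosen for illustration
and are not taken from the present disclosure.

```python
import numpy as np

def make_eem(excitation_nm, emission_nm, peak_ex, peak_em, width_nm=30.0):
    """Return a synthetic EEM: rows are excitation, columns are emission wavelengths."""
    ex = np.asarray(excitation_nm, dtype=float)[:, None]
    em = np.asarray(emission_nm, dtype=float)[None, :]
    eem = np.exp(-((ex - peak_ex) ** 2 + (em - peak_em) ** 2) / (2.0 * width_nm ** 2))
    # Fluorescence is generally emitted at wavelengths longer than the excitation
    # wavelength, so entries with emission <= excitation are set to zero.
    eem[em <= ex] = 0.0
    return eem

excitation = np.arange(330, 500, 10)   # hypothetical excitation grid, nm
emission = np.arange(350, 700, 5)      # hypothetical emission grid, nm
eem = make_eem(excitation, emission, peak_ex=350, peak_em=450)

# One row of the matrix is an emission spectrum at a fixed excitation wavelength.
row = int(np.where(excitation == 350)[0][0])
spectrum_350 = eem[row]
peak_emission = emission[int(np.argmax(spectrum_350))]
```

Each row of `eem` corresponds to one emission spectrum, which is why measuring
at only one to three excitation wavelengths samples only a few rows of the full
matrix.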
Most clinical studies reported to date have measured fluorescence emission
spectra at only a small number of excitation wavelengths (typically one to three) due
to clinical requirements imposed on the size, speed and sensitivity of instrumentation.
The choice of excitation wavelength has been based on factors which vary from study
to study, but include laser availability and predictions of chromophores thought to be
present in normal and abnormal tissues and measurements of fluorescence excitation
emission matrices (EEM) of normal and abnormal tissues in vitro. While in vitro
measurements of tissue EEMs are feasible using commercially available scanning
fluorimeters, several studies have demonstrated that the optical properties of tissue
change significantly when tissue is examined in vitro due in part to interruption of the
blood supply, oxidation and small size of biopsies. Thus, in vitro studies to select
excitation wavelengths are of limited value.

Several recent studies have suggested that differences in optical properties,
assessed using diffuse reflectance spectroscopy, may be used to discriminate normal
and abnormal human tissues in vivo in the urinary bladder and the skin. Furthermore,
measuring both fluorescence and diffuse reflectance spectra may provide additional
information of diagnostic value.

A system capable of measuring spatially resolved reflectance spectra and
fluorescence excitation emission matrices in vivo would remove limitations of many
previous studies, potentially enabling prediction of excitation wavelengths that
provide greatest discrimination of normal and abnormal tissues, as well as a better
understanding of the relative diagnostic ability of changes in absorption, scattering
and fluorescence properties of tissue. Although fiber optic systems to record
fluorescence EEMs and reflectance spectra have been reported, such systems have
measured data from only a single spatial location, and have thus not been able to
perform spatially resolved spectroscopy. Additionally, previous systems have not
been well-adapted for in vivo studies of various tissues.






SUMMARY OF THE INVENTION 

In one respect, the invention is an apparatus for performing fluorescence and
spatially resolved reflectance spectroscopy on a sample, and it includes a light source,
a monochromator, a reflectance illumination fiber, a fluorescence excitation fiber, an
imaging spectrograph, a fluorescence collection fiber, a reflectance collection fiber,
and a detector. The monochromator is in optical communication with the light source.
The reflectance illumination fiber is in optical communication with the light source.
The fluorescence excitation fiber is in optical communication with the
monochromator. The fluorescence collection fiber is in optical communication with
the imaging spectrograph. The reflectance collection fiber is in optical
communication with the imaging spectrograph and is in spaced relation with the
reflectance illumination fiber. The detector is in optical communication with the
imaging spectrograph.

In other aspects, the light source may include a Xe arc lamp. The
monochromator may include a double monochromator. The detector comprises a
thermo-electrically cooled CCD camera. The fluorescence excitation fiber and the
fluorescence collection fiber may be integral. One or more of the fibers may be
positioned flush with the sample. The apparatus may also include a spacer positioned
between one or more of the fibers and the sample. The reflectance illumination fiber,
the fluorescence excitation fiber, the fluorescence collection fiber, and the reflectance
collection fiber may define a fiber optic probe. The probe may be configured to be
positioned within a trocar. The probe may include a center section and an outer
section, and the fluorescence excitation fiber and the fluorescence collection fiber may
be positioned in the center section, and the reflectance illumination fiber and the
reflectance collection fiber may be positioned in the outer section. The apparatus may
include a plurality of fluorescence excitation and collection fibers arranged in a
circular bundle. The apparatus may include a plurality of reflectance collection fibers
defining a plurality of collection positions. The plurality of collection positions may
be spaced between about 0 and about 10 millimeters from the reflectance illumination
fiber. The reflectance collection fiber may define a collection position at about 180
degrees relative to the reflectance illumination fiber. The reflectance collection fiber
may define a collection position at about 90 degrees relative to the reflectance
illumination fiber. The reflectance collection fiber may define a collection position at
about 45 degrees relative to the reflectance illumination fiber. The apparatus may
include one or more fibers in optical communication with the light source and
configured to illuminate the sample during operation of the apparatus. The apparatus
may include a plurality of fluorescence excitation fibers arranged in one or more rows
adjacent the monochromator. The apparatus may include a plurality of fluorescence
excitation fibers and a plurality of reflectance collection fibers arranged in a single
row adjacent the imaging spectrograph. The apparatus may include one or more
unconnected fibers interspersed with the plurality of fluorescence excitation fibers and
the plurality of reflectance collection fibers. The apparatus may include a fiber
connected from the light source to the imaging spectrograph to monitor spectral
output of the light source. The apparatus may include a controller coupled to the
detector.

In another respect, the invention is an apparatus for measuring fluorescence
and spatially resolved reflectance spectra of a sample. The apparatus includes a light
source, a monochromator, a fiber optic probe, an imaging spectrograph, and a
detector. The monochromator is in optical communication with the light source. The
fiber optic probe is in optical communication with the light source and with the
monochromator. The probe includes a plurality of fluorescence excitation and
collection fibers in spaced relation and a plurality of reflectance collection fibers in
spaced relation with a reflectance illumination fiber. The imaging spectrograph is in
optical communication with the plurality of fluorescence collection fibers and with the
plurality of reflectance collection fibers. The detector is in optical communication
with the imaging spectrograph.

In other aspects, the plurality of reflectance collection fibers and the
reflectance illumination fiber may be positioned concentrically about the plurality of
fluorescence excitation and collection fibers. At least one of the plurality of
reflectance collection fibers may define a collection position at about 180 degrees
relative to the reflectance illumination fiber. At least one of the plurality of
reflectance collection fibers may define a collection position at about 90 degrees
relative to the reflectance illumination fiber. At least one of the plurality of
reflectance collection fibers may define a collection position at about 45 degrees
relative to the reflectance illumination fiber. The plurality of collection positions may
be spaced between about 0 and about 10 millimeters from the reflectance illumination
fiber. The probe may include between twenty-one and forty-six optical fibers.

In another respect, the invention is a method for combined fluorescence and
spatially resolved reflectance spectroscopy of a sample. The method includes
directing radiation to the sample with a fluorescence excitation fiber, collecting
radiation from the sample with a fluorescence collection fiber, directing the radiation
from the sample to an imaging spectrograph and a detector, illuminating the sample
with a reflectance illumination fiber, collecting reflected light from the sample with a
reflectance collection fiber in spaced relation with the reflectance illumination fiber,
and directing the reflected light from the sample to an imaging spectrograph and a
detector.

In other aspects, the step of collecting reflected light may include collecting
reflected light from a plurality of collection positions with a plurality of reflectance
collection fibers. The step of collecting reflected light may include collecting
reflected light from the sample with a reflectance collection fiber defining a collection
position at about 180 degrees relative to the reflectance illumination fiber. The step of
collecting reflected light may include collecting reflected light from the sample with a
reflectance collection fiber defining a collection position at about 90 degrees relative
to the reflectance illumination fiber. The step of collecting reflected light may include
collecting reflected light from the sample with a reflectance collection fiber defining a
collection position at about 45 degrees relative to the reflectance illumination fiber.
The sample may include ovarian, head and neck, or cervical tissue. The method may
also include analyzing spectral data from the detector to characterize the sample. The
step of analyzing may include pre-processing the data and reducing a dimension of the
data using principal component analysis. The step of analyzing may also include
selecting one or more diagnostic principal components of the data and forming one or
more algorithms. The step of analyzing may also include forming one or more
composite algorithms. The step of analyzing may also include evaluating at least one
of the algorithms using a cross-validation technique.

In another respect, the invention is a method for combined fluorescence and
spatially resolved reflectance spectroscopy of a sample. The method includes
directing radiation to the sample with a fluorescence excitation fiber, collecting
radiation from the sample with a fluorescence collection fiber, directing the radiation
from the sample to an imaging spectrograph and a detector, illuminating the sample
with a reflectance illumination fiber, collecting reflected light at a plurality of
collection positions from the sample with a plurality of reflectance collection fibers
arranged in spaced relation, directing the reflected light from the sample to an imaging
spectrograph and a detector to produce spectral data, pre-processing the data, and
reducing a dimension of the data using principal component analysis.

WO 99/57529 PCT/US99/09768

The method may also include selecting one or more diagnostic principal 
components of the data and forming one or more algorithms. The method may also 
include forming one or more composite algorithms. The method may also include
evaluating at least one of the algorithms using a cross-validation technique. 

In another respect, the invention is a method for analyzing spectroscopy data to 
define an optimized reduced data set. The method includes pre-processing the 
spectroscopy data, reducing a dimension of the spectroscopy data using principal 
component analysis, and selecting one or more diagnostic principal components of the 
spectroscopy data. 
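
The dimension reduction and component selection steps named above can be
sketched as follows. This is a minimal illustration and not the disclosed
algorithm: the passage does not state how diagnostic principal components are
chosen, so the two-sample statistic and its threshold in
`diagnostic_components`, the variance cutoff, and the synthetic spectra are all
assumptions.

```python
import numpy as np

def pca_scores(data, variance_to_keep=0.99):
    """Project mean-centered rows of `data` (samples x wavelengths) onto the
    leading principal components that together explain `variance_to_keep`."""
    centered = data - data.mean(axis=0)
    # SVD of the centered data: rows of vt are the principal component directions.
    u, s, vt = np.linalg.svd(centered, full_matrices=False)
    explained = s ** 2 / np.sum(s ** 2)
    n_keep = int(np.searchsorted(np.cumsum(explained), variance_to_keep) + 1)
    n_keep = min(n_keep, len(s))
    return centered @ vt[:n_keep].T, vt[:n_keep], explained[:n_keep]

def diagnostic_components(scores, labels, threshold=2.0):
    """Assumed criterion: keep components whose class means differ by more than
    `threshold` standard errors of the difference between the two classes."""
    a, b = scores[labels == 0], scores[labels == 1]
    stderr = np.sqrt(a.var(axis=0, ddof=1) / len(a) + b.var(axis=0, ddof=1) / len(b))
    stat = np.abs(a.mean(axis=0) - b.mean(axis=0)) / stderr
    return np.flatnonzero(stat > threshold)

rng = np.random.default_rng(0)
labels = np.repeat([0, 1], 20)
base = np.sin(np.linspace(0, np.pi, 50))
# Hypothetical spectra: class 1 has a larger amplitude, plus small noise.
spectra = np.outer(1.0 + 0.5 * labels, base) + 0.01 * rng.standard_normal((40, 50))
scores, components, explained = pca_scores(spectra)
selected = diagnostic_components(scores, labels)
```

In this synthetic case a handful of principal component scores replace the 50
intensity values per spectrum, and only the components that separate the two
classes are retained for building a classification algorithm.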

In other aspects, the spectroscopy data may include combined fluorescence and
spatially resolved reflectance spectroscopy data. The step of pre-processing may
include normalization of the spectroscopy data. The step of pre-processing may
include mean scaling the spectroscopy data. The step of pre-processing may include
calculating one or more derivatives on the spectroscopy data. The method may also
include eliminating redundant data from the spectroscopy data. The method may also
include forming one or more algorithms and evaluating at least one of the algorithms
using a cross-validation technique. The method may also include forming one or more
composite algorithms.
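
The three pre-processing options listed above can be sketched in a few lines.
The definitions used here are assumptions for illustration: normalization is
taken as division of each spectrum by its own peak intensity, mean scaling as
subtraction of the mean spectrum of a set (for example, all sites from one
patient), and derivatives as numerical derivatives with respect to wavelength.
The function names and sample values are hypothetical.

```python
import numpy as np

def normalize_to_peak(spectra):
    """Divide each spectrum (row) by its own maximum intensity."""
    return spectra / spectra.max(axis=1, keepdims=True)

def mean_scale(spectra):
    """Subtract the mean spectrum of the set from every spectrum."""
    return spectra - spectra.mean(axis=0, keepdims=True)

def derivative(spectra, wavelengths_nm, order=1):
    """Numerical derivative with respect to wavelength, applied `order` times."""
    out = np.asarray(spectra, dtype=float)
    for _ in range(order):
        out = np.gradient(out, wavelengths_nm, axis=1)
    return out

wavelengths = np.linspace(400.0, 600.0, 5)          # hypothetical grid, nm
spectra = np.array([[1.0, 2.0, 4.0, 2.0, 1.0],
                    [2.0, 4.0, 8.0, 4.0, 2.0]])     # same shape, different scale
normalized = normalize_to_peak(spectra)   # removes the overall intensity difference
scaled = mean_scale(normalized)           # removes what the two spectra share
slope = derivative(spectra, wavelengths)  # first derivative, intensity per nm
```

Because the two sample spectra differ only in overall intensity, normalization
makes them identical and mean scaling then leaves no residual, which shows what
each step discards.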

Applications for the methods and apparatus described herein are vast and 
include, but are not limited to, analysis and detection of disease including cancers and 
pre-cancers (such as cervical, head and neck, colon, lung, esophageal, ovarian) and 
atherosclerosis. Applications also include industry, including, but not limited to, the 
semiconductor industry. 
BRIEF DESCRIPTION OF THE DRAWINGS

The following drawings form part of the present specification and are included 
to further demonstrate certain aspects of the present invention. The invention may be 
better understood by reference to one or more of these drawings in combination with 
the detailed description of specific embodiments presented herein. 

FIG. 1 Block diagram of a Fast EEM system according to one embodiment of 
the present disclosure. 

FIGS. 2A and 2B Probe output at 332 nm according to one embodiment of 
the present disclosure. 

FIG. 3 Inside of a light source according to one embodiment of the present 
disclosure. 

FIG. 4 Outside connectors of the light source according to one embodiment 
of the present disclosure. 

FIGS. 5A and 5B Comparison between the monochromator and the spectral
lamp output.

FIGS. 6A and 6B A probe according to the present disclosure showing
fluorescence excitation fibers, fluorescence collection fibers, a quartz rod, a
reflectance excitation fiber, and reflectance collection fibers.

FIG. 7 Probe according to the present disclosure showing fluorescence fibers,
a quartz rod, reflectance fibers, illumination fibers, a protection shield, and a quartz
shield.

FIGS. 8A and 8C Tip of a probe according to the present disclosure showing
illumination of i) reflectance ii) fluorescence and iii) illumination fibers.

FIGS. 9A and 9B Monochromator and spectrograph connector with 
fluorescence and reflectance collection fibers according to one embodiment of the 
present disclosure. 

FIG. 10 Probe including fiber connectors according to one embodiment of the
present disclosure. Shown are visual illumination fiber 113, reflectance excitation
fiber 115, fluorescence excitation fiber 117, and reflectance collection position 119.

FIG. 11 Correction factors for the spectrograph. 
FIG. 12 Schematic of binning techniques: on-chip binning (left), on-chip
and software binning (right).
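
On-chip binning combines charge from adjacent pixels on the detector before
readout, whereas software binning sums pixel values after readout. The
following minimal sketch illustrates only the software side; the frame size,
bin sizes, and the function name `software_bin` are assumptions for
illustration, and how the disclosed system divides binning between the chip and
software is not specified in this caption.

```python
import numpy as np

def software_bin(frame, bin_rows, bin_cols):
    """Sum non-overlapping bin_rows x bin_cols blocks of a 2-D detector frame."""
    rows, cols = frame.shape
    if rows % bin_rows or cols % bin_cols:
        raise ValueError("frame dimensions must be multiples of the bin size")
    # Split each axis into (number of blocks, block size) and sum within blocks.
    blocks = frame.reshape(rows // bin_rows, bin_rows, cols // bin_cols, bin_cols)
    return blocks.sum(axis=(1, 3))

frame = np.arange(24).reshape(4, 6)   # hypothetical 4 x 6 pixel frame
binned = software_bin(frame, 2, 3)    # 2 x 2 array of block sums
```

Binning trades spatial or spectral sampling for signal per output element; the
total counts in the frame are unchanged by the summation.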

FIG. 13 Main screen of a Fast-EEM user interface according to one
embodiment of the present disclosure.

FIG. 14 System block diagram showing a variable excitation light source, a
fiber optic delivery and collection probe, and a spectral multichannel analyzer
according to one embodiment of the present disclosure.

FIGS. 15A - 15D (Left) Schematic diagram of the distal ends of the probe:
[a] outer shaft, [b] fluorescence excitation and emission fibers, [c] reflectance
collection and illumination fibers, [d] mixing element, [E] reflectance excitation fiber,
[1-3] reflectance collection locations. (Right) Schematic diagram of the proximal ends
of the probe.

FIG. 16A Simulated EEM with peak shifting in [1] excitation wavelength
and [2] emission wavelength.

FIGS. 16B-1 to 16B-6 Simulated EEM with peak shifting in [1] excitation
wavelength and [2] emission wavelength. Calculated x_av and m_av for the simulated
EEM. x_av is sensitive to changes in the excitation position of the peak and m_av is
sensitive to the emission position.

FIG. 17A EEM of Rhodamine standard solution.

FIG. 17B EEM of an FAD and microspheres-based tissue phantom measured
using a FastEEM system.

FIGS. 18A - 18D (A) Emission spectra at 360 nm excitation of the
Rhodamine calibration standard measured with the FastEEM system and SPEX
Fluorolog II fluorimeter. (B) Emission spectra at 360 nm excitation of the scattering
tissue phantom containing FAD and polystyrene microspheres measured with the
FastEEM system and SPEX Fluorolog II fluorimeter. (C) Emission spectra at 450 nm
excitation of the Rhodamine calibration standard measured with the FastEEM system
and SPEX Fluorolog II fluorimeter. (D) Emission spectra at 450 nm excitation of the
scattering tissue phantom containing FAD and polystyrene microspheres measured
with the FastEEM system and SPEX Fluorolog II fluorimeter.

FIGS. 19A and 19B In vivo fluorescence measurements with the FastEEM
system: (A) Fluorescence EEM of a normal site of the tongue. (B) Fluorescence EEM
of a diseased site of the tongue, containing a moderately differentiated squamous cell
carcinoma.

FIGS. 20A - 20C Fluorescence emission spectra of normal and moderately
differentiated squamous cell carcinoma of the tongue from Figure 6. The spectra were
normalized to the peak fluorescence at 350 nm excitation. (a) Fluorescence emission
spectra at 350 nm excitation, (b) Fluorescence emission spectra at 410 nm excitation,
(c) Fluorescence emission spectra at 460 nm excitation.

FIGS. 21A and 21B Emission and excitation autocorrelation vectors of 
normal and moderately differentiated squamous cell carcinoma of the tongue from 
FIGS. 18. (A) Emission autocorrelation vectors. (B) Excitation autocorrelation vectors. 

FIGS. 22A - 22C Reflectance measurements of normal and moderately
differentiated squamous cell carcinoma of the tongue at three different separations 
from the source fiber. (A) Position 1, 1.1 mm separation. (B) Position 2, 2.1 mm 
separation. (C) Position 3, 3 mm separation. 

FIGS. 23A - 23C A schematic of the portable fluorimeter used to measure
cervical tissue fluorescence spectra at three excitation wavelengths.

FIG. 24 A schematic of the formal analytical process used to develop the
screening and diagnostic algorithms. The text in the dashed-line boxes represents
mathematical steps implemented on the spectral data and the text in the solid line
boxes represents outputs after each mathematical step (NS - normal squamous, NC -
normal columnar, LG - LG SIL and HG - HG SIL).

FIGS. 25A - 25C (a) Original and corresponding (b) normalized and (c) 
normalized, mean-scaled spectra at 337 nm excitation from a typical patient. 

FIGS. 26A - 26C (a) Original and corresponding (b) normalized and (c)
normalized, mean-scaled spectra at 380 nm excitation from the same patient.

FIGS. 27A - 27C (a) Original and corresponding (b) normalized and (c)
normalized, mean-scaled spectra at 460 nm excitation from the same patient.

FIG. 28 A plot of the posterior probability of belonging to the SIL category of
all SILs and normal squamous epithelia from the calibration set. Evaluation of the
misclassified SILs indicates that one sample with CIN III, two with CIN II, two with
CIN I and two with HPV are incorrectly classified.

FIG. 29 A plot of the posterior probability of belonging to the SIL category of
all SILs and normal columnar epithelia from the calibration data set. Evaluation of the
misclassified SILs indicates that three samples with CIN II, three with CIN I and one
with HPV are incorrectly classified.

FIG. 30 A plot of the posterior probability of belonging to the HG SIL
category of all SILs from the calibration set. Evaluation of the misclassified HG SILs
indicates that three samples with CIN III and three with CIN are incorrectly classified
as LG SILs; five samples with CIN I and two with HPV are misclassified as HG SILs.

FIGS. 31A - 31C Component loadings (CL) of diagnostic principal
components of constituent algorithm (1), obtained from normalized spectra at (a) 337
(b) 380 and (c) 460 nm excitation, respectively.

FIGS. 32A - 32C Component loadings (CL) of diagnostic principal
components of constituent algorithm (2), obtained from normalized, mean-scaled
spectra at (a) 337 (b) 380 and (c) 460 nm excitation, respectively.

FIGS. 33A - 33C Component loadings (CL) of diagnostic principal
components of constituent algorithm (3), obtained from normalized spectra at (a) 337
(b) 380 and (c) 460 nm excitation, respectively.

FIGS. 34A - 34D Plots of frequency of occurrence vs. emission wavelength
in top 25 performing combinations of three wavelengths: (a) ESL = 65%, (b)
ESL = 75%, (c) ESL = 85%, and (d) ESL = 95%.

FIG. 35 Fluorescence emission spectra normalized by the peak intensity of
the concatenated vector for all 62 sites at 350, 380 and 400 nm excitation. Red lines
indicate histologically cancerous, green lines indicate histologically dysplastic, and
blue lines indicate visually and/or histologically normal sites.

FIG. 36 Plot of the only eigenvector of diagnostic importance at ESL = 65%
for wavelength combination (350 380 400) (lower line at vector index=200) and the
corresponding component loading (upper line at vector index=200).

FIG. 37 Plot of emission vector for a wavelength combination of three
excitation wavelengths (350, 380, 400 nm) normalized by the peak intensity of each
emission spectrum.

FIGS. 38A - 38C Reflectance spectra (A), first (B) and second derivation (C) 
for position one. 
FIGS. 39A - 39C Reflectance spectra (top), first (middle) and second
derivation (bottom) for position two.

FIGS. 40A - 40C Reflectance spectra (top), first (middle) and second 
derivation (bottom) for position three. 

FIGS. 41A - 41C Average reflectance spectra (top), first (middle) and second 
derivation (bottom) for position one. Error bars show standard deviation. 

FIGS. 42A - 42C Average reflectance spectra (top), first (middle) and second 
derivation (bottom) for position two. Error bars show standard deviation. 

FIGS. 43A - 43C Average reflectance spectra (top), first (middle) and second 
derivation (bottom) for position three. Error bars show standard deviation. 

FIGS. 44A - 44C p values comparing the mean intensity, mean first and 
second derivatives of normal tissue versus abnormal tissues, at source detector 
separation 1 (top), 2 (middle) and 3 (bottom). 

FIGS. 45A - 45C p values comparing the mean intensity, mean first and
second derivatives of normal tissue versus dysplastic tissues, at source detector 
separation 1 (top), 2 (middle) and 3 (bottom). 

FIG. 46 Scatter plot of the second derivative at 430 nm for position 2 vs. the 
second derivative at 495 nm for position one. The straight line represents an algorithm 
to separate normal findings from dysplasias and cancers, and results in a sensitivity of 
80% and a specificity of 85%. 

FIG. 47 Scatter plot of the second derivative at 450 nm for position 1 vs. the
first derivative at 510 nm for position three. The straight line represents an algorithm 
to separate normal findings from dysplasias and cancers, and results in a sensitivity of 
80% and a specificity of 82%. 

FIG. 48 Scatter plot of the second derivative at 410 nm for position 1 vs. the 
first derivative at 510 nm for position three. The straight line represents an algorithm 
to separate normal findings from dysplasias and cancers, and results in a sensitivity of 
70% and a specificity of 75%. 
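
The sensitivity and specificity quoted for the straight-line algorithms of
FIGS. 46 - 48 follow from counting correctly and incorrectly classified sites
on either side of the line. The sketch below shows that calculation under
stated assumptions: the feature pairs are invented for illustration and are not
measured data, and the choice that points above the line are called abnormal is
hypothetical.

```python
def classify(points, slope, intercept):
    """Label a point abnormal (True) when it lies above the line y = slope*x + intercept."""
    return [y > slope * x + intercept for x, y in points]

def sensitivity_specificity(predicted, actual):
    """Both arguments are lists of booleans, where True means abnormal."""
    tp = sum(p and a for p, a in zip(predicted, actual))
    tn = sum((not p) and (not a) for p, a in zip(predicted, actual))
    fn = sum((not p) and a for p, a in zip(predicted, actual))
    fp = sum(p and (not a) for p, a in zip(predicted, actual))
    # Sensitivity: fraction of abnormal sites called abnormal.
    # Specificity: fraction of normal sites called normal.
    return tp / (tp + fn), tn / (tn + fp)

# Hypothetical feature pairs (x, y), not measured data.
points = [(0.1, 0.9), (0.2, 0.8), (0.3, 0.2), (0.8, 0.1), (0.7, 0.3), (0.6, 0.9)]
actual = [True, True, True, False, False, False]
predicted = classify(points, slope=1.0, intercept=0.0)
sens, spec = sensitivity_specificity(predicted, actual)
```

Moving the line changes which sites fall on each side, so sensitivity and
specificity generally trade off against each other.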



DESCRIPTION OF ILLUSTRATIVE EMBODIMENTS

FIG. 1 shows one embodiment of an apparatus 10 according to the present
disclosure. The apparatus is adapted to measure both reflectance and fluorescence
data, and may be referred to as a Fast-EEM system (where EEM stands for excitation
emission matrix). Fast EEM system 10, in one embodiment, may include four
main components, although those having skill in the art will recognize that more or
fewer components may be utilized. The components are: (a) an excitation source 20,
which may include an arc lamp 22 and a monochromator 24 for monochromatic and
broad band excitation, (b) a fiber optic probe 30, which may be configured to deliver
excitation light to and collect remitted fluorescence from a sample 60, (c) a detection
apparatus 40, which may include a filter wheel, an imaging spectrograph 42, and a
CCD camera 44 and that spectrally resolves a collected signal, and (d) a control unit
50, which may be a personal computer used to run Fast EEM system 10 and to acquire
data.

Excitation source 20 

The light source 22 for Fast EEM system 10, which may provide both quasi- 
monochromatic excitation for fluorescence and broad band illumination for 
reflectance, may be, in one embodiment, a 150 W ozone free Xe arc lamp (Spectral 
Energy Corp., Westwood NJ) with a spherical rear reflector.

A condenser system including two plano-convex quartz lenses may be used to
couple light into a monochromator 24. With the benefit of the present disclosure,
those having skill in the art will understand that any optical filter or device suitable for
creating bandpass filtered light may be used for monochromator 24. In one
embodiment, monochromator 24 may be a single monochromator. A manual shutter
(not shown) may be located between condensing optics and monochromator 24 and
may be closed to prevent fluorescence excitation light from reaching sample 60 during
reflectance measurements. The scanning speed of monochromator 24 may be, in one
embodiment, about 10 nm/sec. Light may be coupled from the output slit of
monochromator 24 into probe 30 via a fiber optic adapter (Spectral Energy, GMA
257) (not shown) that includes a quartz plano-convex lens and a 5X quartz
microscope objective. The light passing through the objective may be focused to an
appropriate shape to fill one or more fibers of probe 30. In one embodiment, light
passing through the objective may be focused into a vertical line onto twenty-five
fibers of probe 30, the twenty-five fibers being arranged in two columns and placed at
the focal plane of the objective (See FIG. 9A).

A reflectance excitation fiber (See, e.g., FIG. 6) may be coupled to the lamp 
housing of light source 22 via a micropositioner (not shown). Broadband light exiting 
the lamp housing through an exiting hole may be coupled to a reflectance illumination 
fiber using a quartz plano-convex lens (NA=0.24). A five position illumination filter 
wheel (not shown) placed between the lamp and the lens may include three long pass 
filters with 50% transmission at 295 nm, 515 nm and 715 nm, respectively. One of 
the filter positions may be blocked and may act as a shutter to prevent white light from 
reaching sample 60 during fluorescence measurements. 

In another embodiment, the light source 22 for Fast EEM system 10, which 
may provide both quasi-monochromatic excitation for fluorescence and broad band 
illumination for reflectance, may be an ozone-free 450 W Xe arc lamp (FL-1007, 
Instruments SA, Edison, NJ).

Light used for monochromatic fluorescence excitation may be focused with a
spherical mirror (not shown) onto the input slit of monochromator 24. In this
embodiment, monochromator 24 may be a double monochromator (DDD 180,
Instruments SA, Edison, NJ). A spherical rear reflector (not shown) may redirect light
that is exiting the lamp in the opposite direction back onto the spherical mirror. The
slit may be covered with a sapphire window, which may prevent hot air from flowing
out of the lamp housing into the monochromator 24. A double monochromator may be
chosen for monochromator 24 because of its higher stray light rejection compared to a
single monochromator. A double monochromator may be configured in additive mode,
which means that the dispersions of the two holographic gratings are added. Stray
light in such a configuration may be so slight as to be negligible. The focal length of
each of the two monochromators may be about 18 cm, with a high-throughput aperture
of f/3.9. The two holographic gratings may have about 1200 grooves/mm and may be
blazed at 500 nm. In this embodiment, the system's maximal resolution may be about
0.3 nm with an accuracy of about 0.5 nm. The scanning speed in this embodiment may
be about 150 nm/s, and the usable wavelength range may be from about 300 to about
1000 nm. Wavelength scanning may be achieved with a direct digital stepper-motor
with a worm drive mechanism (not shown). Three computer-controlled slits (entrance,
middle, and exit) may be opened between 0 and 7 mm in steps of 12.5 µm. In one
embodiment, a slit-width of about 2 mm may be chosen for both the entrance and the
exit slits. The middle slit may be opened twice as wide as the entrance and the exit
slits to achieve optimal performance. These settings provided a spectral resolution
of about 6 nm FWHM.
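
The slit figures quoted above can be related by simple arithmetic. The sketch
below uses only the stated step size, slit range, slit width, and resolution;
the assumption that bandpass scales linearly with slit width, and the helper
names `slit_steps` and `estimated_fwhm_nm`, are illustrative and are not part of
the present disclosure.

```python
SLIT_STEP_UM = 12.5        # slit step size stated in the text
SLIT_MAX_MM = 7.0          # maximum slit opening stated in the text
ENTRANCE_EXIT_MM = 2.0     # entrance and exit slit width in the described embodiment
RESOLUTION_NM = 6.0        # approximate FWHM reported for that setting

def slit_steps(width_mm, step_um=SLIT_STEP_UM):
    """Number of whole slit steps needed to open a slit to width_mm."""
    steps = width_mm * 1000.0 / step_um
    if abs(steps - round(steps)) > 1e-9:
        raise ValueError("width is not a multiple of the step size")
    return int(round(steps))

entrance_steps = slit_steps(ENTRANCE_EXIT_MM)       # 2 mm entrance and exit slits
middle_steps = slit_steps(2 * ENTRANCE_EXIT_MM)     # middle slit opened twice as wide
# Implied bandpass per millimetre of slit, assuming linear scaling (an assumption).
nm_per_mm = RESOLUTION_NM / ENTRANCE_EXIT_MM

def estimated_fwhm_nm(width_mm):
    """Rough FWHM estimate for another slit width under the linear assumption."""
    if not 0.0 <= width_mm <= SLIT_MAX_MM:
        raise ValueError("slit width outside the 0 to 7 mm range")
    return nm_per_mm * width_mm
```

Under this assumption, narrowing the entrance and exit slits to 1 mm would give
roughly half the bandpass at the cost of reduced throughput; the actual
behavior depends on the instrument and would need to be measured.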

FIG. 2 shows a spectrum taken at 332 nm by coupling light from probe 30
through a fiber optic adapter into a scanning spectrofluorimeter (SPEX, Fluorolog II,
Edison, NJ). An emission scan from 300 nm to 600 nm was performed to collect the
relative intensity of the probe output.

In one embodiment, the coupling of light into a fluorescence excitation bundle 
(See, e.g., FIG. 6 and FIG. 7) was done using a fiber-optic interface kit (220F,
Instruments SA, Edison, NJ). Two plano-convex lenses (different focal lengths) may
be matched to different NAs of the exit slit and of a fiber bundle of probe 30 to
minimize coupling losses. A computer-controlled shutter (LS6, Vincent Associates,
Rochester, NY) may be mounted in front of the probe connector to block fluorescence
excitation light during reflectance measurements.

Light source 22 may be customized to provide white light output. White light
may be needed (a) for reflectance measurements, (b) for visual observation of a
measurement site by a physician, and (c) to monitor the lamp output.

FIG. 3 shows a top view drawing of the inside of the lamp housing according 
to one embodiment. Light bulb 25 and ray traces (dashed lines) for the
monochromator light are shown. In one embodiment, the optimal solution to provide
white light output to the outside of the housing involved the use of a bundle of quartz
fibers. One biconvex lens, mounted in a custom-made rack inside the lamp housing,
coupled light into a bundle of three 600 µm and one 50 µm high-temperature quartz
fibers (Thermocoat, Fiberguide Industries, Stirling, NJ). The light rays are indicated
by the dotted line in FIG. 3. These fibers transported white light to four connectors on
the outside of the housing (See FIG. 4). The first connector C1 may provide
excitation light used for reflectance measurements. The five-position illumination
filter wheel described previously may be placed between two biconvex quartz lenses
(focal length = 20 mm). The second connector C2 may be equipped with one quartz
lens (focal length = 20 mm) that focuses light onto the illumination fiber bundle. A
second shutter (LS6, Vincent Associates, Rochester, NY) may be placed between the
connector and the lens, which may be closed during data acquisition and may
otherwise be held open to deliver light to the illumination fibers of probe 30. The
third 600 µm fiber output C3 may be used for other purposes, or not at all. The 50 µm
fiber output C4 may couple light into a fiber that is directly connected to imaging
spectrograph 42 to record the lamp spectrum for every measurement. In one
embodiment, however, this option is not used.

FIG. 5 illustrates the power output of two monochromatic illumination 
systems (one using a 150 W ozone-free Xe arc lamp and the other using an ozone-free
450 W Xe arc lamp). The output was measured through probe 30 using a calibrated
power meter (818-UV, Newport, Irvine, CA) and represents the flux (W) that is
provided to sample 60, which may be a tissue sample. Above about 400 nm, an
improvement in power of a factor of four is noticeable. Note that the lamp performed
poorly below 400 nm. The light output at about 330 nm is only about 20% of the peak
performance at 460 nm. The low UV output may be due to the fact that the lamp is an
ozone-free model. The light bulb is made out of UV-blocking glass, since ozone is
mainly produced in the surrounding air within this spectral region. In order to have a
useful S/N ratio, prolonged exposure times in the spectral region below 400 nm may
become a necessity.

Probe 30

The combined spatial reflectance and fluorescence probe 30 of the present 
disclosure may be built to meet the following criteria. First, the tissue volume probed 
by the reflectance and fluorescence measurements may overlap. Second, because the 
collected fluorescence intensity may be typically three orders of magnitude lower than
the reflectance intensity, a detector with a high dynamic range may be required.
Weakening the reflectance excitation light by using a smaller excitation fiber or using
a number of fluorescence excitation fibers may, however, alleviate this problem.
Third, the total diameter of the probe may be small enough so that it is possible to
cover an area of only one tissue type; for example, dysplastic lesions around a tumor
are likely to be only a few millimeters wide. Finally, a probe 30 small in diameter
may give the opportunity to use it for minimally invasive surgeries through trocars.
According to one embodiment, probe 30 may fit into a trocar. In one embodiment, it
is designed to fit into a trocar (Reflex STR, 5 mm, Richard-Allan Inc.) that is 
commonly used in the Gynecology Department at The University of Texas M. D. 
Anderson Cancer Center, Houston, TX, (UT MDACC). 

One embodiment of a combined reflectance and fluorescence probe 30 
includes a total of 21 quartz fibers (200 µm core diameter, NA = 0.22). With the
benefit of the present disclosure, however, those of skill in the art will recognize that
more or fewer fibers may be used. Additionally, although the present disclosure refers
to embodiments of a probe including "fibers", it will be understood that any channel
suitable for transmission of light may be substituted therewith. In one embodiment, a
ring of twelve fluorescence collection fibers 70 surrounds a circle of seven
fluorescence excitation fibers 72. In one embodiment (not shown), at least one
fluorescence fiber may be an integral fluorescence excitation and collection fiber. At
the distal end of fluorescence excitation and collection fibers may be a quartz rod
(about 1.5 mm diameter, about 7 mm thick) 74 located to ensure an overlap at the
sample surface between fluorescence excitation and collection fibers. One reflectance
excitation fiber 76 and one reflectance collection fiber 78 (both about 90 µm core
diameter) may be placed outside of the quartz rod and flush to the sample, which may
be tissue, on opposite sides. The reflectance fibers may be about 1.7 mm apart from
each other, and light may be scattered through the same tissue volume that is
examined for fluorescence.

In one embodiment, a probe 30 may have a total length of about 28 cm to 
about 35 cm, which allows the probe to pass a trocar shaft. With the benefit of the 
present disclosure, however, those having skill in the art will recognize that the probe
30, and other components described herein, may be made of different size (and 
materials) according to need or desire. 

Turning to FIG. 7 and FIG. 8, it may be seen that the diagnostic portion of 
probe 30 may include forty-six optical fibers (about 200 µm, NA=0.22) in two
concentric sections. With the benefit of the present disclosure, however, those of skill
in the art will recognize that more or fewer fibers may be used. The center bundle 80
(See FIG. 7) may contain twenty-five fluorescence excitation fibers and twelve
fluorescence collection fibers. At the distal end of the probe 30, these fibers may be
arranged randomly in central bundle 80 and may be placed in mechanical contact with
a short piece (about 1.5 cm long) of thick quartz fiber 82. Light sent through this rod
may be distributed over an examined area. The rod's length may be determined by the 
radius of the rod and the NA of the fibers and may be calculated by taking twice the 
radius and dividing it by the fiber NA. 
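
By way of illustration only, the rod-length rule stated above (twice the rod radius divided by the fiber NA) may be expressed as a short Python sketch. The function name and the numerical inputs (a 1 mm rod radius and NA = 0.22) are assumptions chosen for the example and are not dimensions of any particular embodiment.

```python
def mixing_rod_length_mm(rod_radius_mm: float, fiber_na: float) -> float:
    """Rod length from the rule in the text: 2 * radius / NA."""
    if rod_radius_mm <= 0 or fiber_na <= 0:
        raise ValueError("radius and NA must be positive")
    return 2.0 * rod_radius_mm / fiber_na


# Illustrative (assumed) values: 1 mm rod radius, fibers with NA = 0.22.
example_length_mm = mixing_rod_length_mm(1.0, 0.22)
print(f"rod length: {example_length_mm:.1f} mm")
```

A larger rod radius or a smaller fiber NA lengthens the rod under this rule, since the light cone from each fiber then needs more distance to spread across the rod face.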

Nine fibers for illumination and collection of diffuse reflectance may be 
arranged in a ring around the fluorescence fibers (See element 84, FIG. 7). Three 
collection fibers 86 may be located at about 180°, two fibers 88 and 90 may be located
at about 90°, and two fibers 92 and 98 may be located at about 45° from the
illumination fiber 94. A single collection fiber 96 may be placed directly beside the
reflectance excitation fiber to measure singly backscattered light. Fibers 92 and 98
may have a distance to the excitation source of about 1.4 mm, fibers 88 and 90 of
about 2.4 mm, and fibers 86 of about 3.3 mm. The distal ends of the reflectance
fibers may be flush with the tip of the central fiber and placed in contact with the
sample surface. 

For measurements that take longer than about 30 s, an optical feedback
mechanism for the probe operator may need to be provided to avoid a displacement of
the instrument. Therefore, a third ring of seven fibers 100, with an offset of about 2
cm (for a 28 cm probe) and about 5 cm (for a 35 cm probe) from the tip may be added
for illumination purposes. Probe 30 may have a screw-on protection shield 102 at the
tip of the probe. Specularly reflected light between a quartz shield 104 and the probe
30, however, may lead to an uncorrectable biasing of the probe performance, and
therefore protection shield 102 may optionally not be used. A 30-minute soaking of 
probe 30 in a disinfecting solution like Cidex™ (Johnson and Johnson Inc.) allows the 
probe to be used in the sterile environment of an operating room. 

The arrangement of fibers at the monochromator 24 and the spectrograph 42 
connectors, according to one embodiment, is shown in FIG. 9. The fluorescence
excitation fibers 108 may be arranged in two rows for optimal filling by a
rectangular output beam of the monochromator 24. The fibers on the spectrograph 42
end may be lined up in a single row, as shown. Fibers 110 are fluorescence collection
fibers, and fibers 112 (represented by darkened circles) are the reflectance collection
fibers. Because saturation in one fiber location may bloom to adjacent pixels on the
detector, additional spacing, realized by unconnected fibers (illustrated by
undarkened circles), reduced this problem. In this embodiment, the spectrograph
connector contains fiber 114 that may be connected directly to a white light output of
light source 22, which may be a Xe lamp, to monitor the spectral output of the light
source over time.

FIG. 10 illustrates an entire probe 30, according to one embodiment, including 
connectors and connecting fibers. Note that reflectance collection fiber 94 (See FIG. 
8), the position right next to the excitation fiber, may be interrupted by disconnecting 
SMA connector #2. This feature was created in this embodiment in case the directly 
backscattered light signal was too strong and needed attenuation.

Spectrograph 42 and filter wheel 

Imaging spectrograph 42, in one embodiment, may be a commercial imaging 
spectrograph (Chromex 250 IS, Albuquerque, NM). A grating of about 100 
grooves/mm, blazed at about 450 nm may be used. With the benefit of the present
disclosure, however, those of skill in the art will understand that any optical filter or
device suitable for analyzing spectral content of light from one or multiple sources
simultaneously may be used for imaging spectrograph 42. 

Light collected by fluorescence and reflectance fibers and the excitation light 
guided directly from the source may be coupled through an 8-position,
computer-controlled collection filter wheel (Optomechanics Research, Inc., Vail, AZ), into
imaging spectrograph 42. The filter wheel blocks the fluorescence excitation light
from entering the spectrograph 42. The spectrograph may contain a holographic
grating blazed at about 380 nm with about 100 grooves/mm. The fibers may be
projected onto an entrance slit (about 250 µm) to yield a spectral resolution of about
7 nm.

The non-uniform spectral response of the system may be corrected as shown in 
FIG. 11. These correction factors may be determined from measurements of 
calibration sources; in the visible, a NIST-traceable tungsten ribbon filament lamp,
and in the UV, a deuterium lamp may be used (550C and 45D, Optronic Laboratories
Inc., Orlando, FL).

Variations in the intensity of the fluorescence excitation light source at different
excitation wavelengths may be corrected using measurements of the intensity at each 
excitation wavelength at the probe tip using a calibrated photodiode (818-UV, 
Newport). 
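
The two corrections described above may be combined in a few lines of Python. This is only a sketch under stated assumptions: the function and argument names are hypothetical, the correction factors are assumed to be multiplicative and sampled at the same wavelengths as the spectrum, and the demonstration numbers are invented rather than measured.

```python
import numpy as np


def correct_emission_spectrum(raw_counts, background, response_correction,
                              excitation_power):
    """Subtract the background, apply wavelength-dependent correction factors,
    and normalize by the excitation power measured at the probe tip."""
    raw_counts = np.asarray(raw_counts, dtype=float)
    background = np.asarray(background, dtype=float)
    response_correction = np.asarray(response_correction, dtype=float)
    if excitation_power <= 0:
        raise ValueError("excitation power must be positive")
    return (raw_counts - background) * response_correction / excitation_power


# Invented demonstration values (three emission wavelengths, one excitation).
corrected_demo = correct_emission_spectrum(
    raw_counts=[1100, 1500, 1300],
    background=[100, 100, 100],
    response_correction=[1.0, 0.8, 1.25],
    excitation_power=2.0,
)
print(corrected_demo)
```

Normalizing each emission spectrum by its own excitation power is what allows rows of an excitation emission matrix recorded at different excitation wavelengths to be compared on a common scale.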

CCD Camera 44 

A thermo-electrically cooled CCD camera 44 (Spectrasource HPC-1, Westlake 
Village, CA) may be operated at about -30° C and may be located at the back focal 
plane of the imaging spectrograph 42. Chip dimensions may be about 13.8 x 9.2 mm 
with 1536 x 1024 pixels (Kodak KAF-1600 grade 2), to yield a nominal spectral range
of about 410 nm for a single grating position. Each fiber may take up about 40 pixels. 
The dark current of the CCD chip, in this embodiment, was specified and confirmed 
as 0.25 electrons/pixel/sec when operated at -30° C. Quantum efficiency of the 
lumogen-coated chip may range from a peak of about 40% at about 550 nm to a low 
of about 15% at about 250 nm. 

Binning Pixels 

The HPC-1 CCD camera 44 allows a user to perform on-chip binning of 
pixels. Binning means that neighboring pixels may be added together to represent 
only one data point. This feature is attractive for at least two reasons: (1) it allows a 
reduction in the time required to read data from the chip, and (2) it increases the 
signal-to-noise ratio by reducing the effective read out and shot noise. 

Although a useful feature, excessive binning may diminish the resolution of 
the system. Furthermore, because the full well capacity of the pixels and shift register 
is limited, it is possible to exceed this capacity by either grouping too many pixels 
together or by encountering an unexpectedly strong signal (blooming). When 
blooming occurs, charge in excess of the full well capacity of a capacitive element 
may spill into adjacent pixels. This can essentially fill the pixels with charge and 
render them unavailable for signal detection or perhaps give a false indication of 
signal where none exists. 
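
The effect of binning, and the capacity limit that accompanies it, may be illustrated with a small software model. The sketch below only emulates binning on an array that has already been read out (on-chip binning sums charge before readout); the function names, the test frame, and the capacity value of 4000 counts are assumptions made for the example.

```python
import numpy as np


def bin_spatial(frame, rows_per_bin):
    """Sum groups of adjacent rows (spatial direction) into single data points."""
    frame = np.asarray(frame)
    n_rows, n_cols = frame.shape
    if n_rows % rows_per_bin:
        raise ValueError("row count must be a multiple of rows_per_bin")
    return frame.reshape(n_rows // rows_per_bin, rows_per_bin, n_cols).sum(axis=1)


def exceeds_capacity(binned, capacity):
    """Flag binned values above an assumed capacity, where blooming could occur."""
    return np.asarray(binned) > capacity


# Assumed test frame: 480 rows (12 fibers x about 40 pixels each), 4 columns.
frame = np.full((480, 4), 10)
per_fiber = bin_spatial(frame, 40)    # one row per fiber
all_fibers = bin_spatial(frame, 480)  # every fiber summed into one row
print(per_fiber.shape, all_fibers[0, 0])
print(exceeds_capacity(all_fibers, 4000).all())
```

In this model, binning all rows together yields the largest summed values, which is why grouping too many pixels can exceed the available capacity even when no single pixel is saturated.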

In one embodiment, binning was only electronically implemented in the spatial 
direction on the chip. The 12 fluorescence collection fibers filled 480 pixels and were
all binned together. For the reflectance fibers, a combined binning in hardware
and software was used in one embodiment. This technique had two advantages: (1) it
increased the dynamic range compared to a full binning in hardware, and (2) it
increased the data transfer rates as compared to non-binned data. FIG. 12 shows the
two different binning techniques.

In one embodiment, the camera 44 and the readout electronics did not operate
in a reliable manner. Long-term testing showed that counts on every pixel can vary
from exposure to exposure when the shutter remains closed. A DC offset variation on
the chip, resulting in an average count of 700/pixel/s to 1500/pixel/s, was monitored
during a 12 hour period. The origin of this behavior was expected to be either a
cooling problem of the CCD camera 44 or an unstable DC offset supplied to the A/D
converter. In an attempt to cure at least some of the problems, a higher number of
pixels were digitized than were actually physically present. The count of these fake
pixels reflected the DC offset of the signal and was found to be independent of the
detector temperature. Testing showed that the count of these fake pixels varied the
same way as the real pixels did. In this embodiment, monitoring of the background
was required at every single measurement, since a low fluorescence signal may lie in
this range. The background could be subtracted from the acquired data. In the
embodiment, another problem was discovered with the readout of the chip. The first
electronically binned line that was read out was always corrupted and had to be
discarded. This meant that double the number of pixels were binned into two
columns, of which the first, corrupted one was dumped.
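
The offset correction described above may be sketched as follows. The layout is an assumption made for the example: the extra ("fake") pixels are taken to arrive as trailing columns of each readout, and their mean is used as the DC offset estimate. The function name and the numbers are hypothetical.

```python
import numpy as np


def subtract_dc_offset(readout, n_real_columns):
    """Estimate the DC offset from the extra ("fake") pixels digitized beyond
    the physical chip and subtract it from the real pixels."""
    readout = np.asarray(readout, dtype=float)
    real = readout[:, :n_real_columns]
    fake = readout[:, n_real_columns:]
    if fake.size == 0:
        raise ValueError("readout contains no extra pixels")
    offset = fake.mean()
    return real - offset, offset


# Invented readout: 3 real columns followed by 2 fake columns per row.
readout_demo = [[1010, 1020, 1030, 1000, 1000],
                [1005, 1000, 1015, 1000, 1000]]
corrected_pixels, dc_offset = subtract_dc_offset(readout_demo, n_real_columns=3)
print(dc_offset)
print(corrected_pixels)
```

Because the estimate is taken from the same readout as the signal, it follows exposure-to-exposure drift of the offset, which a single stored background frame would not.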

Software and Control

In one embodiment, National Instruments Labview Version 3.0 (Austin, TX), a
graphical programming development environment based on the G (Graphic)
programming language, may be used to control Fast EEM system 10. The platform for
the control software may be any suitable control device or computer 50. In one
embodiment, a laptop 486/75 MHz personal computer with docking station (Austin
Inc., Austin, TX) was used as computer 50. Communication with the excitation
monochromator may be provided via an RS-232 control module that is interfaced to
the COM port of the docking station of computer 50. A camera control card may be
mounted in the docking station. The imaging spectrograph 42 may be operated using
a National Instruments GPIB IEEE-488 board that is also located inside the docking
station of computer 50. 

In another embodiment, a desktop computer was chosen (Optiplex 233GXa, 
Dell Computer Corporation, Round Rock, TX) equipped with a Windows95™ 
operating system as computer 50. All mentioned cards in this embodiment may be 
connected to the ISA-bus of computer 50. A double monochromator 24 and 
spectrograph 42 controls may be connected by a GPIB IEEE-488 interface (AT- 
GPIB/TNT, National Instruments, Austin, TX). The two shutters and the filter wheel 
may be controlled with a digital I/O card (PC-DIO-24, National Instruments, Austin, 
TX). The CCD camera 44 may have its own ISA-bus interface card. The readout rate 
of the chip in this embodiment was greater than about 65,000 pixels/s. This gave a 
readout time of about 24 s for the whole chip if no binning was used. In this 
embodiment, no on board RAM was available to buffer acquired data. 
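
The readout time quoted above follows from the pixel count and the readout rate, as the short sketch below shows. The function name is hypothetical, and the optional bin factor is an idealization (readout time is assumed to scale inversely with the number of pixels combined), not a measured property of the camera.

```python
def full_chip_readout_time_s(n_columns, n_rows, pixels_per_second, bin_factor=1):
    """Time to read the chip at a given rate, with idealized binning."""
    if pixels_per_second <= 0 or bin_factor < 1:
        raise ValueError("rate must be positive and bin_factor at least 1")
    values_to_read = (n_columns * n_rows) / bin_factor
    return values_to_read / pixels_per_second


# 1536 x 1024 pixels at about 65,000 pixels/s, as stated in the text.
unbinned_time_s = full_chip_readout_time_s(1536, 1024, 65_000)
print(f"unbinned readout: {unbinned_time_s:.1f} s")
```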

In one embodiment, Labview V.5.0 (National Instruments, Austin, TX) was
chosen as the software to control the entire Fast EEM system 10. In this embodiment, 
the goal of software development was to create an easy to use interface that made the 
system controllable by an operator with basic computer knowledge after only a few 
days of training. 

Such software may be designed using a small number of basic sub-VIs (VI:
virtual instrument, National Instruments' expression for software units). Operator
interaction may be minimized to avoid human errors. Automation of file saving and
auto-naming of saved files may be implemented to prevent loss of data by mislabeling
or accidentally overwriting certain files. Such automation may also speed up the
interaction time of an operator with the software between measurements.

In one embodiment, stored fluorescence data was loaded immediately after 
storage and could be visually inspected in the center of the screen. Such a routine may 
be added as a quality-ensuring feature, and it may also help to prevent data loss caused 
by saving errors or misalignment of the system if the operator was experienced in 
interpreting the acquired data.

Software Structure

FIG. 13 shows a main user interface according to one embodiment from which 
the Fast EEM system 10 may be controlled. With the benefit of the present 
disclosure, those having skill in the art will understand that there are numerous ways 
in which system 10 may be controlled and that the interface shown in FIG. 13 is but 
only one of those ways. Other user interfaces may be implemented as is known in the 
art. In FIG. 13, the center displays show four spectra of the last fluorescence 
measurement (top graph) and the acquired reflectance data (bottom). The excitation 
wavelengths of the displayed spectra may be changed online. Around this screen, 
different buttons may be present, which allow access to certain main features.

In the configuration component of the software interface illustrated in FIG. 13, 
all the configurations were accessible and controllable. In the 'Saving parameter'
subprogram, a patient number and the directory path may be defined. The integration
time for the individual exposures and the settings of the CCD camera 44 may be
stored in the corresponding subroutine. The spectrograph settings may be changed in
the 'Chromex' VI. The buttons for the mercury calibration, the lamp monitoring, and
the power output of the probe may also be associated with the configuration settings 
of the software. 

In regard to acquiring data, individual switches for starting the background and
the standards measurements may be placed on the left side of the spectra display. The
fluorescence, reflectance and combined reflectance and fluorescence measurements
may be initiated in the 'Main Measurements' box. Naming of files with the acquired
data may be dependent on which kind of measurement is chosen. In one embodiment, 
no manual naming of files by the operator was necessary. 
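
One way such automatic naming might be arranged is sketched below in Python. The naming pattern, the file extension, and the function name are hypothetical and are not taken from the software described here; the sketch only shows how a name can depend on the measurement type while never reusing an existing file name.

```python
from pathlib import Path


def next_measurement_path(directory, patient_number, measurement_type):
    """Return the first unused file path for this patient and measurement type."""
    directory = Path(directory)
    index = 1
    while True:
        # Hypothetical pattern: patient007_fluorescence_001.dat
        name = f"patient{patient_number:03d}_{measurement_type}_{index:03d}.dat"
        candidate = directory / name
        if not candidate.exists():
            return candidate
        index += 1
```

A caller would request a path immediately before saving each measurement, so that a repeated measurement of the same type receives the next free index instead of overwriting the earlier file.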

Many additional features may be added to the software and user interface. For 
example, an image of the whole CCD chip with all possible settings and binnings may 
be achieved. The monochromator 24 may be moved to any desired wavelength. The 
center wavelength of the spectrograph 42 may be set manually, too. The camera's 44 
exposure time may be adjusted, and it may be possible to choose if the shutter of the
spectrograph 42 should open or if it should remain closed to image the dark current.
Another sub-VI may be designed to change all the settings of the monochromator 24,
such as wavelength and slit width. Emission and reflectance spectra may be loaded
and visually compared on the screen. It may be possible to turn on and off the probe's
illumination light from the main screen. It shall be understood that none of these extra
features need influence the settings for the main measurements. Default values may
always be restored when measurements are started. When exiting the software, a
protocol file may be created that contains all the important settings, the date, file
names and the name of the operator. In one embodiment, about 112 individual VIs
were created to design a reliable, easy-to-use and fault-proof system, although it will
be understood that more or fewer routines may be implemented according to the needs
or desires of the user. In other embodiments, for instance, a simpler or more
complicated user interface may be easily implemented as is known in the art. 

Temporal Performance 

Table 2.1 compares the temporal performance of the two embodiments of Fast 
EEM systems described above: one utilizing a 150 W ozone-free Xe arc lamp, single
monochromator, and twenty-one fiber probe (Embodiment A); and the other system
using a 450 W ozone-free Xe arc lamp, a double monochromator, and a forty-six fiber
probe (Embodiment B). Overall, the time to obtain a complete EEM in Embodiment
B between 330 nm and 500 nm excitation in steps of 10 nm was cut down to less than
45 s, a temporal improvement of 105 seconds over Embodiment A. To obtain the
same amount of counts on the CCD chip, the exposure times may be cut down from
1500 ms to 200 ms, depending on the excitation wavelength. An exposure time of
375 ms may be expected since the amount of light delivered to the tissue may increase
by a factor of 4. The alignment on the emission side was improved in Embodiment B,
so that the throughput was almost twice as much as before. The monochromator's
scanning time may be decreased from 34 s for an entire scan and resetting to the
starting wavelength to less than 3 s. A faster computer and the use of a 32-bit
operating system in Embodiment B cut down the computation time by almost 50%.
However, it still required about 2 s per exposure to transfer the data from the camera
to the computer. This value adds up to 42 s, 75% of the whole data acquisition time.
This handicap may be further improved by replacing the readout electronics of the
CCD chip. The control of the illumination shutter, a new feature of the system, did
not add any extra time to the measurements. The shutter opened and closed in less
than 5 ms. 

In Embodiment B, reflectance measurements may be sped up by using a
200 µm fiber for the excitation light instead of an 80 µm fiber, since more light is
provided to the sample 60, which may be a tissue. A more intense white light output
of the system may serve the same purpose. By using a different imaging spectrograph
42 with a grating with lower spectral dispersion, a wider spectrum may be covered on
the CCD chip. To cover the desired spectral range for reflectance measurements, only
two (instead of three) sub-range exposures may be necessary. Overall data acquisition
over two wavelength ranges and four positions may be achieved in 31 s in
Embodiment B, which is about three times faster than that in Embodiment A, in
which only 3 spatial positions had to be exposed.

Table 2.1 Comparison of Temporal Performance

                                  Embodiment A                   Embodiment B
Fluorescence
  Scanning time:                  2 x 170 nm / 10 nm/s = 34 s    2 x 70 nm / 150 nm/s = 2.7 s
    2 x (500 nm - 330 nm)
  Exposure time                   18 x 1.5 s = 27 s              20 exposures: Σ = 6.0 s (see 3.1.2)
  Moving filter wheel             8 x 1 s = 8 s                  8 x 1 s = 8 s
  Camera shutter, data transport  18 x 4.5 s = 81 s              21 x 2 s = 42 s
  Illumination shutter
  Total                           Σ = 150 s                      Σ = 53.7 s
Reflectance
  Exposure time                   9 exposures: 27 s              8 exposures: 6 s
  Camera shutter, data transport  63 s                           25 s
  Total                           Σ = 90 s                       Σ = 31 s



In summary, a combined reflectance and fluorescence measurement with
Embodiment B may be obtained in 85 s, about three times faster than with
Embodiment A. This temporal improvement may benefit the patient and may also
minimize the chance that the physician moves the probe during measurements.
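
The summary figures may be checked directly from the subtotals reported in Table 2.1. The short Python sketch below uses only those subtotals; the variable names are chosen for the example.

```python
# Subtotals from Table 2.1, in seconds.
fluorescence_s = {"A": 150.0, "B": 53.7}
reflectance_s = {"A": 90.0, "B": 31.0}

# Combined reflectance and fluorescence measurement time per embodiment.
combined_s = {key: fluorescence_s[key] + reflectance_s[key] for key in fluorescence_s}
speedup = combined_s["A"] / combined_s["B"]

print(f"Embodiment A: {combined_s['A']:.1f} s")
print(f"Embodiment B: {combined_s['B']:.1f} s")
print(f"speed-up: {speedup:.1f}x")
```

The combined time for Embodiment B comes to 84.7 s, which rounds to the 85 s quoted above, and the ratio to Embodiment A (240 s) is about 2.8, consistent with "about three times faster".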

The following examples are included to demonstrate preferred embodiments 
of the invention. It should be appreciated by those of skill in the art that the 
techniques disclosed in the examples which follow represent techniques discovered by 
the inventor to function well in the practice of the invention, and thus can be 
considered to constitute preferred modes for its practice. However, those of skill in 
the art should, in light of the present disclosure, appreciate that many changes can be 
made in the specific embodiments which are disclosed and still obtain a like or similar 
result without departing from the spirit and scope of the invention.

EXAMPLE 1

Fluorescence Excitation Emission Matrices of Human Tissue: A System for In vivo 
Measurement and Method of Data Analysis 

This example describes a Fast EEM system capable of measuring spatially
resolved reflectance spectra from 380-950 nm and fluorescence excitation emission
matrices from 330-500 nm excitation and 380-700 nm emission in vivo. System
performance was compared to a standard scanning spectrofluorimeter. This Fast EEM
system was used to interrogate human normal and neoplastic oral cavity mucosa in
vivo. Measurements were made through a fiber optic probe and required about 4
minutes total measurement time. This example also presents a method based on
autocorrelation vectors to identify excitation and emission wavelengths where the
spectra of normal and pathologic tissues differ most. The Fast EEM system provides a
tool with which to study the relative diagnostic ability of changes in absorption,
scattering and fluorescence properties of samples, including tissue samples.

Materials and Methods: 

FIG. 14 illustrates a block diagram of a Fast EEM system 10 in accordance 
with the present disclosure. This system includes at least three main components: (1) 
an arc lamp 22, stepper motor driven monochromator 24 and filter wheel, which
provides monochromatic and broad band excitation, (2) a fiber optic probe 30 which
directs excitation light to the sample 60, which may be a tissue sample, and collects
remitted fluorescence from, in this embodiment, one location and diffusely reflected
light from, in this embodiment, three locations, and (3) a filter wheel, imaging
spectrograph 42 and CCD camera 44 which detects the spectrally resolved reflectance
and fluorescence signals. Excitation monochromator position, filter wheel position,
spectrograph grating position, CCD operation and data acquisition are controlled 
using a laptop personal computer 50 mated to a docking station. The specifications of 
each sub-system are described below. 

The probe 30, illustrated in FIG. 15, included a total of forty-six optical fibers
(200 µm diameter, NA=0.2) arranged in two concentric bundles. The center bundle
contained twenty-five fluorescence excitation fibers and twelve fluorescence
collection fibers. The proximal ends of the fluorescence excitation fibers were
arranged in two vertical lines at the exit slit of the excitation monochromator 24 to 
maximize the coupling of the light into the sample. The proximal ends of the 
fluorescence collection fibers were arranged in a single vertical line at the entrance slit 
of the imaging spectrograph 42. At the distal end of the probe 30, the fibers that 
excite and collect fluorescence were arranged randomly in a central bundle and placed 
in contact with a short piece of a thick quartz fiber (2 mm diameter, 15 mm long, 
NA=0.2). The distal tip of this fiber was placed in contact with the sample surface 60, 
and ensured that the area from which fluorescence was collected was the same as that 
directly illuminated. 

The nine fibers for illumination and collection of diffuse reflectance were 
arranged in a concentric ring around the thick quartz fluorescence measurement fiber. 
The distal ends of these fibers were flush with the tip of the central fiber and were 
placed in contact with the sample surface 60. White light from a port on the side of 
the lamp housing was coupled to the proximal end of a single illumination fiber (80
µm, NA 0.2). Photons that scatter through the tissue and exit the surface were
collected at four different positions with seven collection fibers; three located 180°
from the illumination fiber (3 mm distance), two located 90° from the illumination
fiber (2.1 mm) and two located 45° from the illumination fiber as shown (1.1 mm)
(See FIG. 15). The proximal ends of the reflectance collection fibers were situated at
the top of the vertical line of fluorescence collection fibers, separated by dummy 
fibers as shown in FIG. 15. 
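
The quoted source-detector separations are consistent with fibers lying on a common circle, where the straight-line distance between two fibers is the chord subtended by their angular separation. The sketch below assumes such a circle with a diameter equal to the 180° separation (3 mm); this geometry is an inference made for illustration and is not stated in the text.

```python
import math


def fiber_separation_mm(ring_diameter_mm, angle_deg):
    """Chord length between two points on a circle separated by angle_deg."""
    return ring_diameter_mm * math.sin(math.radians(angle_deg) / 2.0)


# Assumed ring diameter of 3 mm (the separation quoted for the 180° fibers).
separations_mm = {angle: fiber_separation_mm(3.0, angle) for angle in (45, 90, 180)}
for angle, distance in separations_mm.items():
    print(f"{angle:3d} deg: {distance:.2f} mm")
```

Under this assumption the 45° and 90° positions come out near 1.15 mm and 2.12 mm, close to the 1.1 mm and 2.1 mm given above.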

The light source 22 for the instrument, which provided both
quasi-monochromatic excitation for fluorescence and broad band illumination for
reflectance, was a 150 W ozone-free Xe arc lamp (Spectral Energy Corp., Westwood,
NJ) with a spherical rear reflector. A condenser system consisting of two plano- 
convex quartz lenses was used to couple light into monochromator 24. The primary 
condenser was 1.5 inches in diameter with an aperture ratio of f/1.5. The secondary 
condenser was also 1.5 inches in diameter, but was masked to provide numerical 
aperture matching to the monochromator 24. A manual shutter was located between 
the condensing optics and monochromator 24 and was closed to.prevent fluorescence 
excitation light from reaching the sample 60 during reflectance measurements. The 
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monochromator 24 had an aperture ratio of f/3.6 (Spectral Energy, GM 252) and was 
used with an ion-etched holographic grating (ISA, Edison, NJ, 240 nm blaze, 1180 
grooves/mm, dispersion = 3.3 nm/ram). An RS-232 controlled stepper motor drove 
the monochromator 24 with a maximum stepping rate of about 400 step/sec (about 10 
5 nm/sec). A bandwidth of 6.6 nm was selected by setting the entrance slit of the 
monochromator to about 2.0 mm. light was coupled from the monochromator 24 
into the probe 30 via a fiber optic adapter (Spectral Energy, GMA 257) consisting of a 
quartz plano-convex lens and a 5X quartz microscope objective. The light passing 
through the objective was focused onto a vertical line of 25 fibers in two columns,
placed at the focal plane of the objective (See FIG. 15). The reflectance excitation
fiber was attached to the lamp housing via a micropositioner. Broadband light exiting 
the lamp housing through an existing hole was coupled to the reflectance illumination 
fiber using a quartz plano-convex lens (NA=0.24). A five position illumination filter 
wheel placed between the lamp and the lens contained three long pass filters with 50%
transmission at 295 nm, 515 nm and 715 nm. One of the filter positions was blocked
and acted as a shutter to prevent white light from reaching the sample during
fluorescence measurements.

Light collected by fluorescence and reflectance fibers was coupled through an
8 position, computer controlled collection filter wheel, into a Chromex 250 IS
(Albuquerque, NM) imaging spectrograph 42 containing a holographic grating blazed
at 380 nm with 150 grooves/mm and a reciprocal linear dispersion (RLD) of 20
nm/mm. The fibers were projected onto an entrance slit (250 µm) which yielded a
spectral resolution of about 5 nm. A thermo-electrically cooled CCD camera 44
operated at about -30° C (Spectrasource HPC-1, Westlake Village, CA) was located at
the back focal plane of the imaging spectrograph 42. Chip dimensions were 13.8 x
9.2 mm with 1536 x 1024 pixels (Kodak KAF-1600 grade 2), yielding a nominal
spectral range of about 276 nm for a single grating position. Dark current was
specified as 0.25 electrons/pixel/sec when operated at -30° C. The quantum efficiency
of the lumogen coated chip ranged from a peak of 40% at 550 nm to a low of 15% at
250 nm.
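
The bandwidth, resolution and spectral range figures quoted for the monochromator and the spectrograph each follow from multiplying a reciprocal linear dispersion by a physical width. The short sketch below reproduces that arithmetic. It assumes that the spectrograph entrance slit is 250 µm wide and that the 13.8 mm chip dimension lies along the dispersion axis; the function name is illustrative and is not part of any instrument software.

```python
def bandpass_nm(dispersion_nm_per_mm: float, width_mm: float) -> float:
    """Spectral width (nm) spanned by a slit or detector of the given width (mm)."""
    return dispersion_nm_per_mm * width_mm


# Excitation monochromator: 3.3 nm/mm dispersion, 2.0 mm entrance slit.
excitation_bandwidth_nm = bandpass_nm(3.3, 2.0)

# Imaging spectrograph: 20 nm/mm RLD, 250 um (0.250 mm) entrance slit (assumed width).
spectral_resolution_nm = bandpass_nm(20.0, 0.250)

# Spectral range of one grating position: 20 nm/mm across the 13.8 mm chip dimension.
spectral_range_nm = bandpass_nm(20.0, 13.8)

print(f"excitation bandwidth: {excitation_bandwidth_nm:.1f} nm")
print(f"spectral resolution:  {spectral_resolution_nm:.1f} nm")
print(f"spectral range:       {spectral_range_nm:.0f} nm")
```

These values agree with the 6.6 nm bandwidth, the resolution of about 5 nm and the nominal range of about 276 nm stated in the text.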

The detector and imaging spectrograph were wavelength calibrated by
measuring the room light spectra that showed three mercury peaks at 404.7, 436 and
546 nm. The relation between pixels and wavelength was then linearly fitted through
these points.
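
A linear pixel-to-wavelength calibration of this kind can be sketched as a least-squares line through the three mercury peaks. In the sketch below the peak pixel positions are hypothetical values chosen only for illustration (they are not measurements from the instrument), and the use of NumPy is an assumption rather than the software used in this Example.

```python
import numpy as np

# Mercury line wavelengths quoted in the text (nm).
MERCURY_LINES_NM = np.array([404.7, 436.0, 546.0])

# Hypothetical pixel columns at which the three peaks were found on the CCD.
peak_pixels = np.array([400.0, 574.0, 1186.0])

# Least-squares straight line: wavelength = slope * pixel + intercept.
slope, intercept = np.polyfit(peak_pixels, MERCURY_LINES_NM, deg=1)


def pixel_to_wavelength(pixel):
    """Convert CCD pixel column(s) to wavelength (nm) with the fitted line."""
    return slope * np.asarray(pixel, dtype=float) + intercept


# Residuals at the calibration points indicate how well a straight line fits.
residuals_nm = pixel_to_wavelength(peak_pixels) - MERCURY_LINES_NM
print(f"dispersion: {slope:.4f} nm/pixel, offset: {intercept:.1f} nm")
print("residuals (nm):", np.round(residuals_nm, 3))
```

With only three calibration lines the fit has a single degree of freedom, so the residuals are a coarse check on linearity rather than a full validation.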

Fluorescence and reflectance measurements were obtained sequentially. Prior 
to fluorescence measurements, the white light port was closed and pixels illuminated 
by the fluorescence fibers were selected to be read from the CCD 44. Dark current 
and A/D conversion offsets were measured with the same settings as the subsequent
measurement but with a closed camera shutter. These were subtracted from all 
fluorescence and reflectance measurements. The first excitation wavelength was 
selected by scanning the excitation monochromator, the emission filter wheel was 
rotated to select the appropriate long pass filter and the spectrograph grating was 
adjusted to record signal over the desired emission wavelength range. The 
monochromator 24 and camera shutters were then opened for the desired exposure 
time to record the fluorescence emission spectrum (1.5 seconds). The excitation 
wavelength was then incremented, and the process repeated until all desired excitation
wavelengths had been measured. The excitation wavelengths were incremented from
330 to 500 nm in 10 nm steps. Table 1 contains a list of the excitation wavelengths
and corresponding long pass filters and emission wavelength ranges used in this
Example.

Following collection of fluorescence spectra, diffuse reflectance spectra were 
then measured. For these measurements, the monochromator shutter was closed, the 
emission filter wheel was set to the lowest filter position and the pixels illuminated by 
the corresponding reflectance collection fibers were selected to be read from the CCD
44. Dark current and A/D conversion offsets were measured and stored for subtraction 
of the following measurements. The reflectance spectrum was collected over three 
illumination wavelength ranges. Prior to measurement of each range, the appropriate
long-pass filter was selected in the illumination filter wheel, and the spectrograph
grating was adjusted to record signal over the desired wavelength range. The lamp and 
camera shutters were then opened for the desired exposure time to record the 
reflectance spectrum (0.4 - 4.8 seconds). The illumination wavelength range was then 
incremented, and the process repeated until all desired wavelength ranges had been
measured. Exposure times were determined empirically to achieve a signal to noise
ratio greater than 20. Table 1 contains a list of the illumination wavelength ranges
and corresponding long pass filters used for diffuse reflectance measurements. The
high dynamic range of the reflectance measurements, spanning over three orders of 
magnitude, required that each spatial position be read out individually from the CCD 
44. This prevented saturation and blooming artifacts. 

There are no accepted safety standards for illumination of mucosal surfaces 
other than skin and cornea. However, the exposure of solar radiation that is 
equivalent to the exposure received when a measurement is made with this system has 
been calculated. The method compares the spectral irradiance [W/cm² nm] of the
excitation source with solar irradiance data obtained from [NSF Polar Programs UV 
Spectroradiometer Network 1994-1995 Operations Report; NSF UV Radiation 
Monitoring Network 1994 to 1995 Volume 5.0 Data Set. Available at 
WWW.BIOSPHERICAL.COM.]. The comparison includes a point-wise division of 
the irradiance from the FastEEM system to the solar irradiance at the same 
wavelength. This ratio gives a relative solar exposure factor. The solar data is for a
sunny day in San Diego, California. Irradiation during fluorescence excitation is less
than 7 times solar exposure at all wavelengths. Given that fluorescence excitation
times were 1.5 seconds, this corresponds to exposure to solar radiation for less than 11
seconds in any given wavelength band. During diffuse reflectance measurements, the
lamp exposure is maximum at 300 nm, where the relative exposure is a factor of 25
that of the sun. Since the total exposure time for this wavelength band is 14 seconds,
the exposure corresponds to 350 seconds or less than 6 minutes. All other
wavelengths have relative exposure factors of 10 or less resulting in a shorter 
equivalent total solar exposure. 
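
The equivalent solar exposure calculation described above amounts to a point-wise irradiance ratio followed by a multiplication by the exposure time. The sketch below illustrates both steps. The irradiance arrays are hypothetical placeholder values (not the NSF data set or measured lamp output), and the function names are illustrative only.

```python
import numpy as np


def relative_solar_exposure(source_irradiance, solar_irradiance):
    """Point-wise ratio of source to solar spectral irradiance at matching wavelengths."""
    source = np.asarray(source_irradiance, dtype=float)
    solar = np.asarray(solar_irradiance, dtype=float)
    if source.shape != solar.shape:
        raise ValueError("irradiance arrays must share one wavelength grid")
    return source / solar


def equivalent_solar_seconds(relative_factor, exposure_s):
    """Seconds of sunlight delivering the same dose as the instrument exposure."""
    return relative_factor * exposure_s


# Hypothetical irradiances on a common two-point wavelength grid (arbitrary units).
ratio = relative_solar_exposure([14.0, 50.0], [2.0, 2.0])

# Upper bounds quoted in the text: factor 7 for 1.5 s, factor 25 for 14 s.
fluorescence_equiv_s = equivalent_solar_seconds(7.0, 1.5)
reflectance_equiv_s = equivalent_solar_seconds(25.0, 14.0)

print(ratio)
print(fluorescence_equiv_s, "s;", reflectance_equiv_s / 60.0, "min")
```

The two bounds evaluate to 10.5 seconds and 350 seconds, consistent with the figures of less than 11 seconds and less than 6 minutes given in the text.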

Prior to every patient measurement the probe output was measured with a 
calibrated power meter (Newport, Irvine, CA, 818-UV) at 400 nm excitation
wavelength. An average output of 86 µW +/- 12 µW was achieved at this wavelength
with a bandwidth of 6.6 nm. Background fluorescence spectra were measured with 
the probe dipped in a non-fluorescent bottle containing distilled water. This 
background EEM was subtracted from all subsequently acquired EEMs to correct for 
room lights and probe autofluorescence. The non-uniform spectral response of the 
system was corrected using correction factors determined from measurements of 
calibration sources; in the visible a N.I.S.T traceable tungsten ribbon filament lamp
and in the UV a deuterium lamp were used (550C and 45D, Optronic Laboratories
Inc., Orlando, FL). Variations in the intensity of fluorescence excitation light source
at different excitation wavelengths were corrected using measurements of the intensity
at each excitation wavelength at the probe tip using a calibrated photodiode (818-UV,
Newport). Background spectra to correct reflectance measurements for room light
contributions were measured with all parameters set as for tissue measurements
except the white light shutter was closed. These measurements were subtracted from 
all subsequent reflectance spectra. 

Fluorescence and reflectance standards were measured before each patient
measurement. The fluorescence intensity was reported relative to the fluorescence
intensity of a solution of 2 mg/L Rhodamine 610 (Exciton, Dayton, OH) in ethylene
glycol at 460 nm excitation and 580 nm emission. Reflectance data are reported
relative to a 2.68% by volume solution of 1.072 micron diameter polystyrene
microspheres (Polyscience Inc., Warrington, PA). The microsphere standard was used
for its well-characterized optical properties. The total, integrated reflectance of this
standard was measured on a double beam spectrophotometer (U-3300 Hitachi, Tokyo,
Japan) with an integrating sphere attachment (Labsphere Inc., North Sutton, NH). This
was used to correct the reflectance standard measurements made with the FastEEM
system. Tissue spectra at each collection fiber position were divided pointwise by the
corrected standard reflectance spectrum at the corresponding fiber position.
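
The background subtraction and pointwise division by the standard can be sketched as follows. This is a minimal illustration that assumes the spectra are stored as arrays of shape (collection positions, wavelengths) and that the standard spectrum has already been corrected as described above; the array values and the function name are hypothetical.

```python
import numpy as np


def relative_reflectance(tissue_counts, background_counts, corrected_standard):
    """Background-subtract tissue spectra and divide pointwise by the standard.

    All inputs are arrays of shape (n_positions, n_wavelengths), so each
    collection fiber position is normalized by the standard at that position.
    """
    tissue = np.asarray(tissue_counts, dtype=float)
    background = np.asarray(background_counts, dtype=float)
    standard = np.asarray(corrected_standard, dtype=float)
    if not (tissue.shape == background.shape == standard.shape):
        raise ValueError("all spectra must share the same shape")
    return (tissue - background) / standard


# Hypothetical counts for two fiber positions and two wavelengths.
tissue = [[12.0, 24.0], [33.0, 44.0]]
background = [[2.0, 4.0], [3.0, 4.0]]
standard = [[5.0, 10.0], [10.0, 20.0]]

reflectance = relative_reflectance(tissue, background, standard)
print(reflectance)
```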

The EEMs were assembled offline from each series of fluorescence emission 
scans. Data processing and plotting were performed with Matlab (The Math Works
Inc., Natick, MA). Reflectance spectra were assembled from three wavelength areas
giving a range from 380 to 950 nm. The wavelength range was further reduced (380 -
800 nm) to comply with the range of calibration measurements of the reflectance
standards on the U-3300. Reflectance data were reported between 380 and 595 nm, a
range where the possible influence of room lights in the measurement was minimized.

System Validation 

System performance was assessed using two fluorescence standards. The first
standard was a 2 mg/L Rhodamine 610 (Exciton Inc., Dayton, OH) ethylene glycol
solution that is non-scattering, but has peak fluorescence intensity approximately
twice the average intensity of human cervix. The second standard mimics the optical
properties of tissue and consists of 20 µM Flavin Adenine Dinucleotide (FAD, Kodak,
Rochester, NY) and 0.625 vol% polystyrene microspheres (Polyscience Inc., diameter =
1.072 µm).

Both standards were measured with the FastEEM system 10 and a scanning 
spectrofluorimeter (SPEX, Fluorolog II, Edison, NJ). The EEMs measured with the
SPEX were considered as standards since the performance of the system is well
documented (dynamic range = 10^5, spectral resolution 5 nm, corrected for non-uniform
spectral response). The excitation light was incident perpendicular to the sampling 
cuvette and the emitted light was collected at approximately a 20 degree angle with 
respect to excitation light. A front focus arrangement with a 10 mm cuvette was used 
in the SPEX. 60 minutes were required to collect a full EEM from each sample with 
the SPEX.

Clinical Studies

In vivo data were obtained from a group of patients with known or suspected
premalignant or malignant lesions of the oral cavity. The studies were reviewed and
approved by the Internal Review Board of the University of Texas at Austin and the 
Surveillance Committee at the UT MD Anderson Cancer Center (Houston). Informed 
consent was obtained from each person in the study. Before using the probe, it was 
disinfected with Metricide (Metrex Research Corp.) in accordance with the standard 
clinical protocol. Background fluorescence EEM and reflectance spectra were 
measured by dipping the fiber optic probe in a non-fluorescent bottle filled with
deionized water. These EEMs and spectra correspond to the system autofluorescence, 
and were subtracted from all subsequently acquired EEMs for that patient. Next an 
EEM was measured from a Rhodamine calibration standard and a reflectance 
spectrum was measured from a polystyrene solution calibration standard. The probe 
was then guided to the tissue site to be examined and its tip positioned flush with the 
tissue. A fluorescence EEM and reflectance spectra were obtained from sites within a 
lesion and a clinically normal site. Post-spectroscopy, a 2-4 mm biopsy of the tissue 
was taken from normal and abnormal sites where the probe measured spectra. These
specimens were evaluated by an experienced pathologist, Bonnie Kemp, M.D., using
light microscopy and classified using standard diagnostic criteria.

Data Analysis 

One of the goals of the Fast EEM instrument 10 is to provide information for
the identification of excitation wavelengths suitable for the differentiation of tissue of
differing pathological characteristics, as well as identification of the chromophores
responsible for the differences. While all such information is present in the EEMs
collected, it can be difficult to extract due to the dimensionality of the data set. A
method was devised to separately characterize the excitation and emission
characteristics of the data set.

Given that the EEM has dimensions corresponding to $(\lambda_x, \lambda_m)$, the following
autocorrelation vectors are defined:

$$x_{av}(\lambda_x) = \sum_{i=1}^{N} \mathrm{EEM}(\lambda_x, \lambda_{m_i}) \cdot \mathrm{EEM}(\lambda_x, \lambda_{m_i}),$$

$$m_{av}(\lambda_m) = \sum_{i=1}^{N} \mathrm{EEM}(\lambda_{x_i}, \lambda_m) \cdot \mathrm{EEM}(\lambda_{x_i}, \lambda_m),$$

where $x_{av}(\lambda_x)$ is the excitation autocorrelation vector and $m_{av}(\lambda_m)$ is the emission
autocorrelation vector. Essentially, the emission autocorrelation vector is the diagonal
of the product of the EEM with its transpose, and the excitation autocorrelation vector
is the diagonal of the product of the transpose of the EEM with the EEM. Note that in
signal processing terms, the autocorrelation vectors, $x_{av}$ and $m_{av}$, are a measure of the
average signal of the EEM at each excitation or emission wavelength, respectively. In
this way they provide qualitative information about an EEM.
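
The two autocorrelation vectors can be computed directly as sums of squared EEM entries. The sketch below assumes the EEM is stored with rows indexed by emission wavelength and columns by excitation wavelength, which is the orientation under which the matrix-product description above holds; with the opposite orientation the roles of the two products are swapped. The use of NumPy and the function name are illustrative assumptions.

```python
import numpy as np


def autocorrelation_vectors(eem):
    """Return (x_av, m_av) for an EEM stored as (emission rows, excitation columns).

    x_av[j] sums the squared signal over all emission wavelengths at excitation j.
    m_av[i] sums the squared signal over all excitation wavelengths at emission i.
    """
    eem = np.asarray(eem, dtype=float)
    x_av = np.sum(eem * eem, axis=0)  # collapse the emission axis
    m_av = np.sum(eem * eem, axis=1)  # collapse the excitation axis
    return x_av, m_av


# Small random matrix standing in for a measured EEM (5 emission x 3 excitation).
rng = np.random.default_rng(0)
eem = rng.random((5, 3))
x_av, m_av = autocorrelation_vectors(eem)

# Equivalent matrix-product form: diagonals of EEM.EEM^T and EEM^T.EEM.
m_av_from_product = np.diag(eem @ eem.T)
x_av_from_product = np.diag(eem.T @ eem)
print(np.allclose(m_av, m_av_from_product), np.allclose(x_av, x_av_from_product))
```

The element-wise form avoids building the full product matrices, which matters little at this size but keeps the cost linear in the number of EEM entries.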

An example with simulated data is presented in FIGS. 16A and 16B to 
illustrate how autocorrelation vectors reflect changes in fluorescence peak positions in 
EEMs. Two kinds of changes are simulated in the modeled data: a shift in the 
excitation wavelength at which a fluorescence peak appears, and a shift in the
emission wavelength at which a fluorescence peak appears. The original peak in the
EEM was modeled as a single Gaussian at 380 nm excitation, 550 nm emission with a
FWHM of 35 nm in emission and excitation wavelengths. The original peak was then
shifted by 30 nm in excitation as shown by arrow 1 in FIG. 16A. The shift in
emission wavelength is shown by arrow 2 in FIG. 16A, and corresponds to a 30 nm
shift in the emission peak of the original data. Three sets of autocorrelation vectors
were computed: one for the EEM with the original peak, one for the EEM with the
excitation wavelength-shifted peak, and one for the EEM with the emission 
wavelength-shifted peak. The autocorrelation vectors are shown in FIG. 16B. 
Comparing the vectors for the original EEM (row 1 in FIG. 16B) with the vectors
from the EEM with the excitation wavelength-shifted peak (row 2 in FIG. 16B), it is
seen that the excitation autocorrelation vector is sensitive to the change in excitation 
wavelength but not in emission wavelength. Similarly, comparing the autocorrelation 
vectors for the original EEM with the vectors from the EEM with the emission 
wavelength shift in the peak (row 3 in FIG. 16B) shows that the emission 
autocorrelation vector is sensitive to the changes in emission wavelength but not 
excitation wavelength. 
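
The behavior described for the simulated data can be reproduced with a short numerical sketch. It models the peak as a separable two-dimensional Gaussian with a 35 nm FWHM and compares the peak positions of the autocorrelation vectors before and after each shift. The wavelength grids, the direction of the 30 nm shifts (taken here toward longer wavelengths) and the storage orientation (rows = emission, columns = excitation) are assumptions made for illustration and are not taken from FIGS. 16A and 16B.

```python
import numpy as np

FWHM_NM = 35.0
SIGMA_NM = FWHM_NM / (2.0 * np.sqrt(2.0 * np.log(2.0)))  # Gaussian FWHM -> sigma

# Assumed wavelength grids (nm); both include the peak centers used below.
excitation_nm = np.arange(300.0, 501.0, 5.0)
emission_nm = np.arange(450.0, 701.0, 5.0)


def gaussian_eem(ex_center_nm, em_center_nm):
    """Single Gaussian peak; rows = emission, columns = excitation."""
    ex_profile = np.exp(-0.5 * ((excitation_nm - ex_center_nm) / SIGMA_NM) ** 2)
    em_profile = np.exp(-0.5 * ((emission_nm - em_center_nm) / SIGMA_NM) ** 2)
    return np.outer(em_profile, ex_profile)


def autocorrelation_peaks(eem):
    """Wavelengths (nm) at which x_av and m_av reach their maxima."""
    x_av = np.sum(eem ** 2, axis=0)
    m_av = np.sum(eem ** 2, axis=1)
    return float(excitation_nm[np.argmax(x_av)]), float(emission_nm[np.argmax(m_av)])


original = autocorrelation_peaks(gaussian_eem(380.0, 550.0))
excitation_shifted = autocorrelation_peaks(gaussian_eem(410.0, 550.0))
emission_shifted = autocorrelation_peaks(gaussian_eem(380.0, 580.0))

print("original:          ", original)
print("excitation shifted:", excitation_shifted)
print("emission shifted:  ", emission_shifted)
```

In this model only the excitation vector peak moves for the excitation shift, and only the emission vector peak moves for the emission shift, in line with the comparison of rows 1 to 3 described above.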

It is sometimes desirable to normalize the autocorrelation vectors to facilitate 
comparisons between different sets of measurements. Normalized autocorrelation 
vectors have been calculated by dividing these vectors by their RMS value, in effect 
forcing the area of the vector to one unit of signal energy. The normalized emission 
autocorrelation vector is well suited for the identification of differential features in 
EEMs, such as the shifting or broadening of fluorescence peaks. 
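
The RMS normalization can be sketched in a few lines. The example vector is hypothetical and the function name is illustrative; the check at the end shows that two measurements differing only by an overall intensity factor yield the same normalized vector.

```python
import numpy as np


def rms_normalize(vector):
    """Divide a vector by its root-mean-square value."""
    vector = np.asarray(vector, dtype=float)
    rms = np.sqrt(np.mean(vector ** 2))
    if rms == 0.0:
        raise ValueError("cannot normalize an all-zero vector")
    return vector / rms


# Hypothetical emission autocorrelation vector and a brighter copy of it.
m_av = np.array([1.0, 4.0, 9.0, 4.0, 1.0])
normalized = rms_normalize(m_av)
normalized_brighter = rms_normalize(3.0 * m_av)

# After normalization the mean squared value is one, independent of overall scale.
print(np.mean(normalized ** 2), np.allclose(normalized, normalized_brighter))
```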

Results and Discussion: 

FIGS. 17A and 17B show fluorescence EEMs of the non-scattering 
Rhodamine standard and the scattering FAD phantom obtained with a FastEEM
system 10. Intensities are reported relative to the Rhodamine intensity measured at 460
nm excitation and 580 nm emission wavelength. FIGS. 18A and 18C show
fluorescence emission spectra of the Rhodamine standard obtained at 370 and 450 nm
excitation with the SPEX and the FastEEM system 10 as well as the fluorescence
background. FIGS. 18B and 18D show the same spectra for the scattering FAD phantom
obtained at the same excitation wavelengths; the spectra are normalized at their
maximum. Note the presence of Rayleigh scattering peaks from the excitation source
in the data taken with the SPEX. In general, from non-scattering samples (FIGS. 18A,
18C) the FastEEM system 10 collects less light above 600 nm than the SPEX. This
may be due to the different collection efficiencies of the FastEEM probe and the front
face collection geometry of the SPEX. Under scattering conditions and with lower
fluorescence signal, the influence of background fluorescence becomes more critical.
At 370 nm excitation wavelength the FastEEM system 10 measures more fluorescence
below 500 nm. A comparison with the measured fluorescence background however 
shows that the additional signal has the same shape as the background. It has been 
hypothesized that the background may have been underestimated by measuring it in a 
non-scattering, non-fluorescent medium.

In-vivo fluorescence EEMs of the oral cavity were measured from 71 sites and
in-vivo reflectance spectra were measured from 49 sites; these were obtained from
patients in two studies. The first study included patients with abnormal oral lesions 
identified in a previous medical examination (17 patients). The second study, 
contributing nine patients, was of normal volunteers. All sites interrogated 
spectroscopically in patients with lesions were biopsied and submitted for 
histopathological analysis. Spectra and biopsies were also obtained from a 
contralateral site with no lesion in these patients with abnormal lesions. These 
biopsies were also evaluated histopathologically. No biopsies were taken from the 
normal volunteers. In this Example, the inventors show representative EEMs from
tissue found to be histopathologically normal and malignant to illustrate spectral
features detectable with the FastEEM system.

Two EEM contour plots from a normal and an abnormal area of the tongue are 
presented in FIGS. 19A and 19B, respectively. In the normal sample, fluorescence is 
observed throughout the whole collection range, with a peak located at 330/380
(excitation/emission) and a ridge extending from 340/450 to 450/500. Table II lists
excitation-emission maxima pairs of endogenous tissue chromophores. Comparison
of the observed peaks with Table II shows these peaks are consistent with the
emission of structural proteins such as collagen and elastin, pyridine nucleotides
(NADH) and flavoproteins (FAD). The normal site shows overall increased 
fluorescence with respect to the abnormal site shown in FIG. 19B. The abnormal site, 
assessed by a pathologist as being moderately differentiated squamous cell carcinoma, 
also shows broad fluorescence throughout. Peaks are observed at 330/380, 350/460,
460/520 and 500/630. A valley is seen at 420 nm excitation between 560 and 580 nm
emission. This valley is seen to extend along the 420 nm excitation line as well as the
580 nm emission line. Table III suggests that these features are produced by
hemoglobin reabsorption. Hemoglobin reabsorption may also in part account for the
shift in the peaks of the abnormal EEM relative to the normal EEM. A summary of
the excitation and emission maxima for the peaks observed in the normal and 
abnormal sites measured is presented in Table IV. 

Fluorescence emission spectra at three selected excitation wavelengths are
shown in FIG. 20, illustrating changes in relative intensities of fluorescence emission.
For comparison purposes each set (normal/abnormal) was normalized to the
maximum at 350 nm excitation. FIG. 20(a) shows the emission spectra at 350 nm
excitation. Fluorescence from the normal site is seen as a broad peak with a maximum
at 455 nm. The peak from the abnormal site is seen to be narrower and red-shifted.
Examination of this spectrum at 410, 540 and 580 nm suggests that the change in
lineshape is due to oxygenated hemoglobin. The general line shapes of the
fluorescence observed at 410 nm excitation (FIG. 20(b)) are seen to be similar for
both sites in the 450-575 nm excitation range, with a broad peak at 500 nm. The
abnormal site shows a significantly lower fluorescence intensity, as well as an extra,
narrow fluorescence peak at 640 nm, attributed to porphyrin fluorescence. FIG. 20(c)
shows the emission spectra at 460 nm excitation. The normal site shows a broad peak
at 520 nm and clear modulation from hemoglobin reabsorption at 540 and 580 nm.
Fluorescence from the abnormal site shows an even more marked hemoglobin
reabsorption; also the overall fluorescence intensity is reduced.

FIGS. 21A and 21B show the emission and excitation autocorrelation vectors 
for the same measurements. Note that the plots have a logarithmic y-axis. The 
emission autocorrelation vectors have a large broad peak at 460 nm corresponding to 
the main fluorescence peak observed in the EEMs. The vectors show the effect of
hemoglobin absorption around 410, 540 and 580 nm in the abnormal site and the
presence of additional fluorescence in the UV in the normal sample (FIG. 21A). This 
autocorrelation vector also highlights the peak at 610 nm in the abnormal sample. 
The excitation autocorrelation vectors show different line shapes. The curve 
corresponding to the normal site decreases steadily from 330 nm to 500 nm excitation.
The curve from the abnormal site shows a peak at 350 nm and a minimum at 410 nm.
The latter illustrates the greater influence of hemoglobin reabsorption in the abnormal
sample also shown in FIG. 20.

The corresponding reflectance data is shown in FIGS. 22A-22C. Position 1
corresponds to the collection fibers closest to the source fiber and position 3 to those
furthest from the source fiber as shown in FIG. 15. The differences in position allow
for spatially resolved reflectance measurements. Differences induced by the 
fluorescence reabsorption of oxygenated hemoglobin in the normal site and abnormal 
site are shown. The modulation of the spectrum by the 540 and 580 nm absorption bands
is seen to be significantly stronger in the abnormal sample; this is consistent with the
increased reabsorption seen in the fluorescence spectrum of the abnormal sample.
The reflectance in the blue range (450-500 nm) of the abnormal site is consistently
higher than that of the normal site. Below 450 nm the reflectance seems not to differ 
between the normal and abnormal samples. 

Conclusions 

The total data acquisition time for the data presented here was 2.5 minutes for 
a fluorescence EEM, and 1.5 minutes for the spatially resolved reflectance 
measurements. However, only 29 seconds of this time represented fluorescence 
collection. Actual reflectance collection time was 26 seconds. The most time 
consuming process was changing the excitation wavelength using the stepper motor 
controlled excitation spectrograph and changing the corresponding long-pass filter 
using the remotely controlled filter wheel. Worm drive based monochromators are 
available (DDD180, ISA) which require less than 16 seconds to scan our entire
wavelength range in 10 nm steps, and could substantially reduce the total 
measurement time. Using a higher power lamp may further reduce acquisition time of 
both fluorescence and reflectance. 
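
The share of the acquisition time spent actually collecting light follows from the figures above. The sketch below is only a worked calculation using the quoted totals and collection times; the function name is illustrative.

```python
def collection_fraction(collection_s: float, total_min: float) -> float:
    """Fraction of the total acquisition time spent collecting light."""
    return collection_s / (total_min * 60.0)


# Fluorescence EEM: 29 s of collection within a 2.5 minute acquisition.
fluorescence_fraction = collection_fraction(29.0, 2.5)

# Spatially resolved reflectance: 26 s of collection within 1.5 minutes.
reflectance_fraction = collection_fraction(26.0, 1.5)

print(f"fluorescence: {fluorescence_fraction:.1%} of acquisition time")
print(f"reflectance:  {reflectance_fraction:.1%} of acquisition time")
```

Roughly one fifth of the fluorescence acquisition and under one third of the reflectance acquisition is light collection, which is why faster wavelength and filter changes are identified as the main route to a shorter measurement.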

This Example has demonstrated the acquisition of EEMs in combination with 
spatially resolved reflectance measurements of tissue phantoms and in the oral cavity 
in vivo with good signal to noise ratio. The system features easy and arbitrary 
selection of excitation wavelengths in the UV and visible range. The system is also 
portable, and capable of functioning in a hospital operating room. Probes used in the 
Fast EEM system incorporate channels to measure spatially resolved reflectance and 
fluorescence, and are built small enough (less than about 5 mm) to be used during
endoscopic surgical procedures. Autocorrelation vectors x_av and m_av are a suitable
method to reduce the data set while preserving information about the wavelength
bands carrying information. Based on the representative data shown here, fluorescence 
emission and excitation as well as reflectance data appear promising for the 
identification of tumors of the oral cavity. The Fast EEM system is an ideal tool to 
identify a subset of the most promising optical features to identify pathological 
findings in large clinical studies.

EXAMPLE 2

Cervical Pre-Cancer Detection Using A Multivariate Statistical Algorithm Based On 
Laser Induced Fluorescence Spectra At Multiple Excitation Wavelengths 

A portable fluorimeter was developed and utilized to acquire fluorescence
spectra from 381 cervical sites in 95 patients at 337, 380 and 460 nm excitation
immediately prior to colposcopy. A multivariate statistical algorithm was used to
extract clinically useful information from tissue spectra acquired in vivo. Two
full-parameter algorithms were developed using tissue fluorescence emission spectra at all
three excitation wavelengths (161 excitation-emission wavelength pairs) for cervical
pre-cancer (squamous intraepithelial lesion (SIL)) detection: a screening algorithm
which discriminates between SILs and non SILs with a sensitivity of 82%±1.4 and
specificity of 68%±0.0, and a diagnostic algorithm which differentiates high grade
SILs from non high grade SILs with a sensitivity and specificity of 79%±2 and
78%±6, respectively. Multivariate statistical analysis was also employed to reduce the
number of fluorescence excitation-emission wavelength pairs needed to re-develop
algorithms that demonstrate a minimum decrease in classification accuracy. Two
reduced-parameter algorithms which employ fluorescence intensities at only 15
excitation-emission wavelength pairs were developed: the screening algorithm
differentiates SILs from non SILs with a sensitivity of 84%±1.5 and specificity of
65%±2 and the diagnostic algorithm discriminates high grade SILs from non high
grade SILs with a sensitivity and specificity of 78%±0.7 and 74%±2, respectively.
Both the full-parameter and reduced-parameter screening algorithms discriminate
between SILs and non SILs with a similar specificity (±5%) and a substantially
improved sensitivity relative to Pap smear screening. A comparison of the
full-parameter and reduced-parameter diagnostic algorithms to colposcopy in expert hands
indicated that all three have a very similar sensitivity and specificity for differentiating 
high grade SILs from non high grade SILs. 
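
For reference, sensitivity and specificity figures of the kind quoted above are computed from a comparison of algorithm output with histopathology. The sketch below uses the standard definitions (sensitivity = true positives / all diseased sites, specificity = true negatives / all non-diseased sites). The label lists are hypothetical and are not data from this study; the function name is illustrative.

```python
def sensitivity_specificity(truth, predicted):
    """Return (sensitivity, specificity) for paired boolean labels.

    truth[i] is True when histopathology classifies site i as diseased;
    predicted[i] is True when the algorithm classifies site i as diseased.
    """
    pairs = list(zip(truth, predicted))
    true_pos = sum(1 for t, p in pairs if t and p)
    false_neg = sum(1 for t, p in pairs if t and not p)
    true_neg = sum(1 for t, p in pairs if not t and not p)
    false_pos = sum(1 for t, p in pairs if not t and p)
    if true_pos + false_neg == 0 or true_neg + false_pos == 0:
        raise ValueError("need at least one diseased and one non-diseased site")
    return true_pos / (true_pos + false_neg), true_neg / (true_neg + false_pos)


# Hypothetical labels for ten sites: five diseased, five not diseased.
truth = [True, True, True, True, True, False, False, False, False, False]
predicted = [True, True, True, True, False, False, False, False, True, True]

sensitivity, specificity = sensitivity_specificity(truth, predicted)
print(f"sensitivity = {sensitivity:.0%}, specificity = {specificity:.0%}")
```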

This paper presents the development and application of a detection technique
for human cervical pre-cancer based on laser induced fluorescence spectroscopy. A
portable fluorimeter consisting of two nitrogen pumped-dye lasers, a fiber-optic probe
and a polychromator coupled to an optical multi-channel analyzer was utilized to
acquire fluorescence spectra from 381 cervical sites in 95 patients at three excitation 
wavelengths: 337, 380 and 460 nm. A general multivariate statistical algorithm was 
then used to analyze and extract clinically useful information from tissue spectra 
acquired in vivo. First, a screening algorithm was developed to discriminate between 
SILs and non SILs (normal squamous and columnar epithelia and inflammation); 
second, a diagnostic algorithm was developed to differentiate HG SILs from non HG 
SILs (LG SILs, normal epithelia and inflammation). The retrospective and prospective 
accuracy of both the screening and diagnostic algorithms were compared to the 
accuracy of Pap smear screening and to colposcopy in expert hands. 

The general multivariate statistical algorithm was initially developed and 
tested using cervical tissue spectra acquired at 337 nm excitation from 476 cervical 
sites in 92 patients. This algorithm could be used to differentiate SILs and normal 
squamous tissues with an average sensitivity and specificity of 91%±2 and 78%±3, 
respectively. A limitation however is that spectra of normal columnar tissues and 
inflammation were indistinguishable from those of SILs at this single excitation 
wavelength. Furthermore, a multivariate statistical algorithm based solely on spectra
at 337 nm excitation could not discriminate between HG SILs and LG SILs
effectively. 

However, multivariate statistical analysis of cervical tissue fluorescence 
spectra acquired in vivo at 380 nm and 460 nm excitation from a subset of the 92 
patients indicated that spectra at these excitation wavelengths can overcome the 
limitations of spectra at 337 nm excitation. Spectra at 380 nm excitation from 165 
sites in a first group of 40 patients could be used to differentiate SILs from normal
columnar epithelia and inflammation with a sensitivity and specificity of 77%±1 and
72%±9, respectively; spectra at 460 nm excitation from 149 sites in a second group of 
24 patients could be used to differentiate HG SILs from LG SILs with a sensitivity 
and specificity of 80%±4 and 76%±5, respectively. 

The results from previous clinical studies suggested that an algorithm based on 
normalized, mean-scaled spectra at 337 nm excitation may be used to differentiate 
between SILs and normal squamous tissues, while an algorithm based on similarly 
pre-processed spectra at 380 nm excitation may be used to differentiate SILs from
normal columnar tissues and samples with inflammation. Finally, a third algorithm
based on normalized tissue spectra at 460 nm excitation may be used to discriminate
between LG SILs and HG SILs. These results suggest that (1) a composite screening
algorithm based on a combination of the first two constituent algorithms may be used
to differentiate between SILs and non SILs (normal epithelia and inflammation) and
(2) a composite diagnostic algorithm which combines all three constituent algorithms
may be used to differentiate HG SILs from non HG SILs (LG SILs, normal tissues and
inflammation). 

The primary goal of the clinical study described in this Example was to
evaluate the accuracy of constituent and composite algorithms which address certain
limitations of previous clinical studies. Fluorescence spectra acquired in vivo at all
three excitation wavelengths from 381 cervical sites in 95 patients were analyzed to
determine if the accuracy of each of the three constituent algorithms previously
developed may be improved using tissue spectra at a combination of two or three
excitation wavelengths rather than at a single excitation wavelength. A second goal of
the analysis was to integrate the three independently developed constituent algorithms
that discriminate between pairs of tissue types into composite screening and diagnostic
algorithms that may achieve discrimination between many of the clinically relevant
tissue types. The effective accuracy of a composite screening algorithm for the
identification of SILs and a composite diagnostic algorithm for the identification of
HG SILs was evaluated.

The final goal of the analysis was to determine if fluorescence intensities at a
reduced number of excitation-emission wavelength pairs may be used to re-develop
constituent and composite algorithms that may achieve classification with a minimum
decrease in predictive ability. A significant reduction in the number of required
fluorescence excitation-emission wavelength pairs may enable the development of a
cost-effective clinical fluorimeter. The accuracy of the constituent and composite
algorithms based on the reduced emission variables was compared to the accuracy of
those that utilize entire fluorescence emission spectra.

Instrumentation

A schematic of the portable fluorimeter which was used to acquire cervical 
tissue fluorescence spectra at three excitation wavelengths is shown in FIG. 23(a). The 
fiber-optic probe (Valdor Fiber Optics, VSC/FER/4SMA-1/7-BUN) included a central 
fiber surrounded by a circular array of six fibers; all seven fibers having the same 
characteristics (0.22 NA, 200 µm core diameter, 245 µm diameter with cladding).
Three fibers along the diameter of the distal end of the probe (FIG. 23(b)) were used
for excitation light delivery. The purpose of the remaining four fibers was to collect
the emitted fluorescence from the area directly illuminated by the probe. A quartz
shield (3 mm in diameter and 2 mm thick) at the tip of the distal end of the probe that 
is in direct tissue contact (FIG. 23(c)) provided a fixed distance between the optical 
fibers and the tissue surface so fluorescence intensity can be measured in calibrated 
units. 

An area 1 mm in diameter was illuminated by each excitation fiber. The
overlap of the illumination area viewed by the three excitation fibers and the four
collection fibers was approximately 80% at the outer surface of the quartz shield. Note
that the central excitation fiber has four adjacent collection fibers whereas the two
excitation fibers in the periphery of the probe have only two adjacent collection fibers
(FIG. 23(b)). However, due to the large overlap of the optical fibers at the outer face
of the quartz shield, this difference in the excitation-emission configuration relates
only to a small difference in the collection efficiency of the fluorescence generated
due to excitation delivered by the central and peripheral excitation fibers. The
difference in collection efficiency is accounted for by normalizing tissue fluorescence
spectra to the peak fluorescence intensity of a Rhodamine 610 calibration standard 
measured using the same probe configuration. 

Two nitrogen pumped-dye lasers (laser characteristics: 5 ns pulse duration, 30 
Hz repetition rate) (Laser Photonics, LN300C) were used to provide illumination at 
three different excitation wavelengths: one laser served to deliver excitation light at 
337 nm (fundamental) and had a dye module which was used to generate light at 380 
nm using the fluorescent dye, BBQ (1E-03 M in 7 parts toluene and 3 parts ethanol). 
The dye module of the second laser was used to provide illumination at 460 nm using
the fluorescent dye, Coumarin 460 (1E-02 M in ethanol). Laser illumination at each
excitation wavelength, 337, 380 and 460 nm, was coupled into each of the three
excitation fibers of the probe. Note that two 10 nm bandpass filters, one centered at
380 nm and the other centered at 460 nm, were placed between the excitation fiber and
the dye module used to generate illumination at 380 and 460 nm, respectively, to
prevent leakage from the fundamental at 337 nm. In this Example, the average fluence
per pulse at 337, 380 and 460 nm excitation was 15.2, 11.5 and 18 nJ/mm²,
respectively. The pulse energy at 337 nm excitation was intentionally reduced so that
the measured fluorescence signal did not exceed the dynamic range of the detector.

The proximal ends of the four collection fibers were arranged in a circular
array and imaged at the 500 µm wide entrance slit of an f/3.8 spectrograph equipped
with a 300 ln/mm grating (Jarrell Ash, Monospec 18) coupled to a 1,024 intensified
diode array controlled by a multi-channel analyzer (Princeton Instruments, OMA).
The collection optics between the proximal end of the four emission collection fibers
and the polychromator included two quartz plano-convex lenses. Between these lenses
was a filter wheel assembly containing long pass filters with 50% transmission at 360
(GG360), 400 (GG400) and 475 (GG475) nm which are used to block scattered
excitation light at 337, 380 and 460 nm excitation, respectively, from the detector. The
purpose of the filter wheel was to position the appropriate long pass filter in the
optical path during fluorescence measurements at each excitation wavelength. The
nitrogen pumped-dye lasers were used to externally trigger a pulser (Princeton
Instruments, PG200) which served to synchronize the 200 ns collection gate of the
detector to the leading edge of the laser pulse. The gating of the detector eliminated
the effects of the colposcopy white light illumination during fluorescence
measurements. Data acquisition was computer controlled.

Clinical measurements 

A randomly selected group of non-pregnant patients referred to the colposcopy
clinic of the University of Texas MD Anderson Cancer Center on the basis of
abnormal cervical cytology was asked to participate in the in vivo fluorescence
spectroscopy study. Informed consent was obtained from each patient who
participated and the study was reviewed and approved by the Institutional Review
Boards of the University of Texas, Austin and the University of Texas, MD Anderson
Cancer Center. Each patient underwent a complete history and a physical
examination including a pelvic exam, a Pap smear and colposcopy of the cervix,
vagina and vulva. After colposcopic examination of the cervix, but before tissue
biopsy, fluorescence spectra were acquired on average from two colposcopically
abnormal sites, two colposcopically normal squamous sites and one normal columnar
site (if colposcopically visible) from each patient. Tissue biopsies were obtained only
from abnormal sites after they had been identified by colposcopy and then analyzed by
the probe. Tissue biopsies were not obtained from normal squamous or columnar sites
analyzed by the probe to comply with routine patient care procedure. All tissue
biopsies were fixed in formalin and submitted for histologic examination.
Hematoxylin and eosin stained sections of each biopsy specimen were evaluated by a
panel of four board certified pathologists and a consensus diagnosis was established
using the Bethesda classification system. This classification system, which has
previously been used to grade cytologic specimens, has now been extended to
classification of histology samples. Samples were classified as normal squamous,
normal columnar, inflammation, LG SIL or HG SIL. Samples with multiple diagnoses
were classified into the most severe histo-pathologic category.

Prior to each patient study, the probe was disinfected and a background
spectrum was acquired at all three excitation wavelengths consecutively with the
probe dipped in a non-fluorescent bottle containing distilled water. The background
spectrum indicated no fluorescence due to optical components of the fluorimeter or 
the disinfectant and was subtracted from all subsequently acquired spectra at 
corresponding excitation wavelengths for that patient. Next, with the probe placed on 
the face of a quartz cuvette containing a solution of Rhodamine 610 dissolved in 
ethylene glycol (2 mg/L), 50 fluorescence spectra were measured at each excitation 
wavelength. After calibration, fluorescence spectra were acquired from the cervix: 10 
spectra for 10 consecutive pulses were acquired at 337 nm excitation; next, 50 spectra 
for 50 consecutive laser pulses were measured at 380 nm excitation and then at 460 
nm excitation. The data acquisition time was 0.33 s at 337 nm excitation and 1.67 s at 
each 380 and 460 nm excitation per cervical site. The time required to switch between 
the two nitrogen pumped-dye lasers and the three long pass filters was approximately
5 s. Hence, the total time required to record fluorescence emission spectra at all three
excitation wavelengths from one cervical site was approximately 10 s. Spectra were 
collected in the visible region of the electromagnetic spectrum with a resolution of 10 
nm (full width at half maximum) and a signal to noise ratio of 100:1 at the 
fluorescence maximum at each excitation wavelength. 
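
By way of illustration only, the acquisition times quoted above can be checked from the stated 30 Hz repetition rate and the number of pulses per excitation wavelength. The variable names below are hypothetical, and the sketch assumes the 5 s switching time is counted once per site. The sum comes to roughly 8.7 s, which is consistent with the "approximately 10 s" stated in the text.

```python
# Arithmetic check of the per-site acquisition time (illustrative sketch).
REP_RATE_HZ = 30.0                    # laser repetition rate stated in the text
PULSES = {337: 10, 380: 50, 460: 50}  # pulses (spectra) per excitation wavelength
SWITCH_TIME_S = 5.0                   # approximate laser/filter switching time (assumed counted once)

# Time spent firing pulses at each excitation wavelength.
acq_time_s = {wl: n / REP_RATE_HZ for wl, n in PULSES.items()}

# Total time per cervical site: pulse time at all wavelengths plus switching.
total_s = sum(acq_time_s.values()) + SWITCH_TIME_S

for wl, t in acq_time_s.items():
    print(f"{wl} nm excitation: {t:.2f} s")
print(f"total per site: about {total_s:.1f} s")
```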

All spectra were corrected for the non-uniform spectral response of the 
detection system using correction factors obtained by recording the spectrum of an 
N.I.S.T. traceable calibrated tungsten ribbon filament lamp. Spectra from each cervical
site at each excitation wavelength were averaged to obtain a single spectrum per site.
The fluorescence spectra obtained at each excitation wavelength from the Rhodamine 
610 calibration standard were also averaged to obtain a single spectrum per excitation 
wavelength. The average tissue spectra were then normalized to the average peak 
fluorescence intensity of the Rhodamine 610 calibration standard at the corresponding 
excitation wavelength for that patient; absolute fluorescence intensities are reported in 
these calibrated units. In this clinical study, fluorescence spectra were acquired at all 
three excitation wavelengths from each cervical site from a total of 381 sites in 95 
patients during colposcopy. 
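
The calibration sequence described above (background subtraction, spectral response correction, averaging, and normalization to the Rhodamine 610 peak) may be sketched as follows. This is a minimal illustration, not the software used in the study: the function names are hypothetical, and it is assumed that the correction factors are applied as per-wavelength multipliers.

```python
def average_spectra(spectra):
    """Average a list of equal-length spectra point by point."""
    n = len(spectra)
    return [sum(values) / n for values in zip(*spectra)]


def calibrate(tissue_spectra, standard_spectra, background, correction):
    """Return the averaged tissue spectrum in calibrated units.

    tissue_spectra, standard_spectra: lists of raw spectra at one excitation wavelength
    background: background spectrum at the same excitation wavelength
    correction: per-wavelength response correction factors (assumed multiplicative)
    """
    def clean(spectrum):
        # Subtract the background, then correct for detector response.
        return [(s - b) * c for s, b, c in zip(spectrum, background, correction)]

    tissue_avg = average_spectra([clean(s) for s in tissue_spectra])
    standard_avg = average_spectra([clean(s) for s in standard_spectra])

    # Normalize to the peak intensity of the averaged calibration standard.
    peak = max(standard_avg)
    return [v / peak for v in tissue_avg]


# Toy numbers chosen only to make the arithmetic easy to follow.
calibrated = calibrate(
    tissue_spectra=[[3, 5, 7], [5, 7, 9]],
    standard_spectra=[[11, 21, 11], [11, 21, 11]],
    background=[1, 1, 1],
    correction=[1, 1, 1],
)
print(calibrated)
```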

Development of screening and diagnostic algorithms 

FIG. 24 illustrates a schematic of the formal analytical process used to develop 
screening and diagnostic algorithms for the differential detection of SILs, in vivo. In 
FIG. 24, the text in the dashed-line boxes represents the mathematical steps
implemented on the spectral data, and the text in the solid-line boxes represents the
output after each mathematical process. There are four primary steps involved in the
multivariate statistical analysis of tissue spectral data (FIG. 24). The first step is to 
pre-process spectral data to reduce inter-patient and intra-patient variation within a 
tissue type; the pre-processed spectra are then dimensionally reduced into an 
informative set of principal components that describe most of the variance of the 
original spectral data set using Principal Component Analysis (PCA). Next, the 
principal components that contain diagnostically relevant information are selected 
using an unpaired, one-sided student's t-test, and finally a classification algorithm 
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based on logistic discrimination is developed using these diagnostically relevant 
principal components. 

In summary, three constituent algorithms were developed using multivariate 
statistical analysis (Fig. 24): constituent algorithm (1) discriminates between SILs and 
normal squamous tissues, constituent algorithm (2) discriminates between SILs and 
normal columnar tissues and finally, algorithm (3) differentiates HG SILs from LG
SILs. The three constituent algorithms were then combined to develop two composite 
algorithms (Fig. 24): constituent algorithms (1) and (2) were combined to develop a 
composite screening algorithm which discriminates between SILs and non SILs. All 
three constituent algorithms were then combined to develop a composite diagnostic 
algorithm which differentiates HG SILs from non HG SILs. 

Multivariate statistical analysis of cervical tissue spectra 

As a first step, three methods of pre-processing were applied to the spectral 
data at each excitation wavelength: (1) normalization, (2) mean-scaling and (3) a
combination of normalization and mean-scaling. Similarly pre-processed spectra at
each excitation wavelength were combined to create spectral inputs at the following
combinations of excitation wavelengths: (337, 460) nm, (337, 380) nm, (380, 460) nm
and (337, 380, 460) nm. Pre-processing of spectral data resulted in four types of 
spectral inputs (original and three types of pre-processed spectral inputs) at three 
single excitation wavelengths and at four possible combinations of multiple excitation 
wavelengths. Hence, there were a total of 12 spectral inputs at single excitation 
wavelengths and 16 spectral inputs at multiple excitation wavelengths which were 
evaluated using the multivariate statistical algorithm. 
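
The count of 28 spectral inputs follows from four spectral types evaluated at three single wavelengths and at four wavelength combinations. A short sketch, using hypothetical labels for the pre-processing types, enumerates them:

```python
from itertools import combinations

EXCITATIONS_NM = (337, 380, 460)
# Original spectra plus the three pre-processing methods named in the text.
SPECTRAL_TYPES = ("original", "normalized", "mean-scaled", "normalized, mean-scaled")

# Three single wavelengths, and every combination of two or three wavelengths.
single = [(wl,) for wl in EXCITATIONS_NM]
multiple = [combo for r in (2, 3) for combo in combinations(EXCITATIONS_NM, r)]

spectral_inputs = [(kind, wls) for kind in SPECTRAL_TYPES for wls in single + multiple]

n_single = sum(1 for _, wls in spectral_inputs if len(wls) == 1)
n_multiple = len(spectral_inputs) - n_single
print(n_single, n_multiple, len(spectral_inputs))  # 12 16 28
```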

Prior to PCA, the input data matrix, D (r x c) was created so each row of the 
matrix corresponded to the pre-processed fluorescence spectrum of a sample and each 
column corresponded to the pre-processed fluorescence intensity at each emission 
wavelength. Spectral inputs at multiple excitation wavelengths were created by 
arranging spectra at each excitation wavelength in series in the original spectral data 
matrix. PCA was used to dimensionally reduce the pre-processed spectral data matrix 
into a smaller orthogonal set of linear combinations of the emission variables that 
account for most of the variance of the spectral data set.

Average values of principal component scores were calculated for each
principal component of each tissue type. An unpaired, one-sided Student's t-test was
employed to determine the diagnostic content of each principal component. The 
hypothesis that the means of the principal component scores of two tissue types are 
different was tested for (1) normal squamous epithelia and SILs, (2) normal columnar 
epithelia and SILs and (3) inflammation and SILs. The t-test was extended a step
further to determine if there were any statistically significant differences between the 
means of the principal component scores of HG SILs and LG SILs. Principal 
components for which the hypothesis stated above was statistically significant (P < 
0.05) were retained for further analysis. 
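
The construction of the data matrix D, the dimension reduction by PCA, and the comparison of principal component scores between two tissue types may be sketched as follows. This is an illustrative sketch using NumPy, not the implementation used in the study. The function names and the `variance_to_keep` threshold are hypothetical, and column mean-centering before the decomposition is an assumption. Only the t statistic is computed here; converting it to a one-sided P value requires the Student's t distribution with na + nb - 2 degrees of freedom.

```python
import numpy as np


def build_data_matrix(spectra_by_excitation):
    """Arrange spectra in series: one row per sample, one column per emission variable."""
    return np.hstack([np.asarray(block, dtype=float) for block in spectra_by_excitation])


def pca_scores(D, variance_to_keep=0.99):
    """Return principal component scores and the variance fraction of each component."""
    centered = D - D.mean(axis=0)  # assumed: columns are mean-centered
    U, S, _ = np.linalg.svd(centered, full_matrices=False)
    explained = S**2 / np.sum(S**2)
    # Keep the smallest number of components reaching the requested variance.
    k = int(np.searchsorted(np.cumsum(explained), variance_to_keep)) + 1
    k = min(k, len(S))
    return U[:, :k] * S[:k], explained[:k]


def t_statistic(a, b):
    """Unpaired, pooled-variance Student's t statistic for two groups of scores."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    na, nb = len(a), len(b)
    pooled = ((na - 1) * a.var(ddof=1) + (nb - 1) * b.var(ddof=1)) / (na + nb - 2)
    return (a.mean() - b.mean()) / np.sqrt(pooled * (1.0 / na + 1.0 / nb))
```

A principal component would be retained when the statistic for the two tissue types under comparison corresponds to P < 0.05, as stated in the text.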

Next, a statistical classification algorithm was developed using the 
diagnostically useful principal components to calculate the posterior probability that 
an unknown sample belongs to each tissue type under consideration. The posterior 
probability of an unknown sample belonging to each tissue type was calculated using
logistic discrimination. The posterior probability is related to the prior and conditional 
joint probabilities and to the costs of misclassification of the tissue types under 
consideration. The prior probability of each tissue type was determined by calculating 
the observed proportion of cases in each group. The cost of misclassification of a 
particular tissue type was varied from 0 to 1 in 0.1 increments, and the optimal cost
was identified when the total number of misclassified samples based on the
classification algorithm was a minimum. If there was more than one cost at which the 
total number of misclassified samples was a minimum, the cost that maximized 
sensitivity was selected. The conditional joint probabilities were developed by 
modeling the probability distribution of each principal component of each tissue type 
using the normal probability density function, which is characterized by μ (mean) and
σ (standard deviation). The best fit of the normal probability density function to the
probability distribution of each principal component (score) of each tissue type was
obtained in the least squares sense, using μ and σ as free parameters of the fit. The
normal probability density function was then used to calculate the conditional joint 
probability that an unknown sample, given that it is from tissue type i, will exhibit a 
set of principal component scores, X. 
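
A minimal sketch of the posterior probability calculation described above is given below. It rests on several assumptions that are not stated in the text: the posterior is taken as a cost-weighted form of Bayes' rule, the conditional joint probability is taken as the product of the per-component normal densities, and the cost for the second tissue type is taken as one minus the cost for the first. The priors (62% and 38%) and the cost of 0.7 are the values reported for constituent algorithm (1); the μ and σ values are invented for illustration only.

```python
import math


def normal_pdf(x, mu, sigma):
    """Normal probability density with mean mu and standard deviation sigma."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))


def conditional_joint(scores, params):
    """P(X | tissue type), assumed to be a product of per-component normal densities."""
    p = 1.0
    for x, (mu, sigma) in zip(scores, params):
        p *= normal_pdf(x, mu, sigma)
    return p


def posterior(scores, classes):
    """Cost-weighted Bayes posterior for each tissue type (assumed form)."""
    weighted = {
        name: c["prior"] * c["cost"] * conditional_joint(scores, c["params"])
        for name, c in classes.items()
    }
    total = sum(weighted.values())
    return {name: w / total for name, w in weighted.items()}


# Priors and SIL cost as reported for algorithm (1); mu and sigma are hypothetical.
example_classes = {
    "SIL": {"prior": 0.38, "cost": 0.7, "params": [(1.0, 1.0)]},
    "normal squamous": {"prior": 0.62, "cost": 0.3, "params": [(-1.0, 1.0)]},
}
example_posterior = posterior([1.0], example_classes)
print(example_posterior)
```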
The multivariate statistical algorithm was developed and optimized using a
calibration set and then tested on a prediction set of approximately equal prior 
probability (Table 1). The purpose of testing the algorithm on the prediction set was to 
determine (1) an unbiased estimate of the algorithm's classification accuracy and (2) if 
the number of sample spectra within each category in the calibration set is sufficient
to describe the spectral data in the prediction set. The calibration and prediction sets
were developed by randomly assigning the spectral data into the two sets with the
condition that both contain roughly equal numbers of samples from each
histo-pathologic category. The random assignment ensured that not all spectra from a single
patient were contained in the same data set.
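
One way to carry out such a random assignment with roughly equal numbers per histo-pathologic category is sketched below. The function name, the `seed` parameter and the tuple layout of the samples are hypothetical; the exact randomization procedure used in the study is not described.

```python
import random
from collections import defaultdict


def split_calibration_prediction(samples, seed=0):
    """Randomly split (sample_id, category) pairs into two sets, balanced per category."""
    rng = random.Random(seed)
    by_category = defaultdict(list)
    for sample in samples:
        by_category[sample[1]].append(sample)

    calibration, prediction = [], []
    for category in sorted(by_category):
        group = by_category[category][:]
        rng.shuffle(group)
        half = (len(group) + 1) // 2  # calibration gets the extra sample when odd
        calibration.extend(group[:half])
        prediction.extend(group[half:])
    return calibration, prediction
```

For example, five SIL samples and four normal samples would yield three SIL and two normal samples in the calibration set, and two of each in the prediction set.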

Development of constituent algorithms 

The multivariate statistical algorithm was developed and optimized using all
28 types of pre-processed spectral inputs from the calibration set. The algorithm was
used to identify spectral inputs which provide the greatest discrimination between the
following pairs of tissue types: (1) SILs and normal squamous epithelia, (2) SILs and
normal columnar epithelia, (3) SILs and inflammation, and (4) HG SILs and LG SILs.
The optimal spectral input for differentiating between two particular tissue types was
identified when the total number of samples misclassified from the calibration set
using the multivariate statistical algorithm was a minimum. The algorithm based on
the spectral input that minimized misclassification between the two tissue types under
consideration was implemented on the prediction data set.

Three multivariate statistical constituent algorithms were developed using
tissue spectra at three excitation wavelengths. Constituent algorithm (1) was
developed to differentiate between SILs and normal squamous epithelia; constituent
algorithm (2) was developed to differentiate between SILs and normal columnar
epithelia and constituent algorithm (3) could be used to discriminate between LG SILs
and HG SILs. A constituent algorithm which can discriminate between SILs and
tissues with inflammation could not be developed using spectral data from the current
clinical study.

Development of composite algorithms

Each of the independently developed constituent algorithms was intended to 
discriminate only between pairs of tissue types. A combination of these constituent 
algorithms was required to provide discrimination between several of the clinically 
relevant tissue types. Therefore, two composite algorithms were developed: a 
composite screening algorithm was developed to differentiate between SILs and non 
SILs (normal squamous and columnar epithelia and inflammation) using constituent 
algorithms (1) and (2) and a composite diagnostic algorithm was developed to 
differentiate HG SILs from non HG SILs (LG SILs, normal epithelia and 
inflammation) using all three constituent algorithms. 

The composite screening algorithm was developed in the following manner. 
First, constituent algorithms (1) and (2) were developed independently using the 
calibration data set. The classification outputs from both constituent algorithms were 
used to determine if a sample being evaluated is SIL or non SIL: first, using 
constituent algorithm (1), samples were classified as non SIL if they had a posterior
probability less than 0.5; otherwise, they were classified as SIL. Next, only samples that
were classified as SIL based on algorithm (1) were tested using algorithm (2).
Again, samples were classified as non SIL if their posterior probability was less than 
0.5; otherwise they were classified as SIL. The spectral data from the prediction set 
was evaluated using the composite screening algorithm in an identical manner. 

The composite diagnostic algorithm was implemented in the following 
manner. The three constituent algorithms were developed independently using the 
calibration set. Algorithms (1) and (2) were implemented on each sample from the 
calibration data set, as described previously. Only samples that were classified as SIL 
based on algorithms (1) and (2) were tested using algorithm (3). If samples evaluated 
using algorithm (3) had a posterior probability greater than 0.5, they were classified as 
HG SIL; otherwise they were classified as non HG SIL. The spectral data from the 
prediction set was evaluated using the composite diagnostic algorithm in an identical 
manner. 
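
The decision logic of the two composite algorithms described above may be sketched as a simple cascade. The function and argument names are hypothetical. Each argument is the posterior probability returned by the corresponding constituent algorithm (probability of SIL for algorithms (1) and (2), probability of HG SIL for algorithm (3)). For brevity the sketch takes all probabilities as inputs, although in the described procedure a later algorithm is applied only to samples that pass the earlier one.

```python
def composite_screening(p_alg1, p_alg2, threshold=0.5):
    """Classify a sample as 'SIL' or 'non SIL' using algorithms (1) and (2) in sequence."""
    if p_alg1 < threshold:
        return "non SIL"  # rejected by algorithm (1)
    if p_alg2 < threshold:
        return "non SIL"  # passed algorithm (1), rejected by algorithm (2)
    return "SIL"


def composite_diagnostic(p_alg1, p_alg2, p_alg3, threshold=0.5):
    """Classify a sample as 'HG SIL' or 'non HG SIL' using all three algorithms."""
    if composite_screening(p_alg1, p_alg2, threshold) == "non SIL":
        return "non HG SIL"
    # Only samples called SIL by algorithms (1) and (2) reach algorithm (3).
    return "HG SIL" if p_alg3 > threshold else "non HG SIL"


print(composite_screening(0.8, 0.3))        # non SIL
print(composite_diagnostic(0.8, 0.9, 0.7))  # HG SIL
```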
Results

Constituent algorithms (1), (2) and (3) 

Table 2 summarizes the components of the optimal set of three constituent 
algorithms. Constituent algorithm (1) can be used to differentiate between SILs and 
normal squamous epithelia; algorithm (2) differentiates between SILs and normal 
columnar epithelia and algorithm (3) discriminates between LG SILs and HG SILs.

Pre-processing

FIG. 25(a) illustrates average fluorescence spectra per site acquired from 
cervical sites at 337 nm excitation from a typical patient. All fluorescence intensities
are reported in the same set of calibrated units. Corresponding normalized and 
normalized, mean-scaled spectra are illustrated in FIG. 25(b) and 25(c), respectively. 
Evaluation of the original spectra at 337 nm excitation (Fig. 25(a)) indicates that the 
fluorescence intensity of SILs is less than that of the corresponding normal squamous 
tissue and greater than that of the corresponding normal columnar tissue over the 
entire emission spectrum. Examination of normalized spectra from this patient (Fig. 
25(b)) indicates that following normalization, the fluorescence intensity of the normal 
squamous tissue is greater than that of corresponding SILs over the wavelength range
360 to 450 nm only; between 460 and 600 nm, the fluorescence intensity of SILs is
greater than that of the corresponding normal squamous tissue which in part reflects 
the longer peak emission wavelength of SILs. A comparison of the spectral line shape 
of SILs to that of the normal columnar tissue illustrates the opposite phenomenon. The 
normalized fluorescence intensity of SILs is greater than that of the corresponding 
normal columnar tissue over the wavelength range 360 to 450 nm; however, between 
460 and 600 nm, the fluorescence intensity of the normal columnar tissue is greater 
than that of the SILs; this spectral difference reflects the longer peak emission 
wavelength of the normal columnar tissue relative to that of SILs. Further evaluation
of normalized spectra in Fig. 25(b) indicates that there are spectral line shape 
differences between LG SILs and HG SILs over the wavelength range 360 to 420 nm. 

The corresponding normalized, mean-scaled spectra of this patient, shown in
Fig. 25(c), display differences in the normalized fluorescence spectrum (Fig. 25(b))
from a particular site with respect to the average normalized spectrum (the average of
all normalized spectra obtained from this patient). As the average normalized
spectrum has been subtracted from each normalized spectrum obtained from this
patient, the mean now lies at Y=0 over the entire emission wavelength range.
Evaluation of Fig. 25(c) indicates that between 360 and 450 nm, the normalized,
mean-scaled fluorescence intensity of the normal squamous tissue is greater than the
mean, and that of the normal columnar tissue is less than the mean. Above 460 nm,
the opposite phenomenon is observed; the fluorescence intensity of the normal
squamous tissue is less than the mean, while that of the normal columnar tissue is
greater than the mean. The fluorescence intensity of SILs lies close to the mean and is
bounded by the intensities of the two normal tissue types. In addition, between 360
and 420 nm, the normalized, mean-scaled fluorescence intensity of the LG SIL is
slightly greater than the mean, while that of the HG SIL is less than the mean. 
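
The two pre-processing operations discussed above may be sketched as follows. Mean-scaling follows the description in the text (the average spectrum of a patient is subtracted from each spectrum of that patient). The definition of normalization used here, division of each spectrum by its own peak intensity, is an assumption, as this passage does not define it. The function names are hypothetical.

```python
def normalize(spectrum):
    """Divide a spectrum by its peak intensity (assumed definition of normalization)."""
    peak = max(spectrum)
    return [v / peak for v in spectrum]


def mean_scale(patient_spectra):
    """Subtract a patient's average spectrum from each of that patient's spectra."""
    n = len(patient_spectra)
    mean = [sum(values) / n for values in zip(*patient_spectra)]
    return [[v - m for v, m in zip(spectrum, mean)] for spectrum in patient_spectra]


def normalize_and_mean_scale(patient_spectra):
    """Normalize each spectrum, then mean-scale within the patient."""
    return mean_scale([normalize(s) for s in patient_spectra])
```

After mean-scaling, the spectra of a patient average to zero at every emission wavelength, which corresponds to the mean lying at Y=0 in Fig. 25(c).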

FIG. 26(a) illustrates average fluorescence spectra per site acquired from
cervical sites at 380 nm excitation from the same patient. FIG. 26(b-c) show the
corresponding normalized, and normalized, mean-scaled spectra, respectively. In Fig.
26(a), the fluorescence intensity of SILs is less than that of the corresponding normal
squamous tissue, with the LG SIL exhibiting the weakest fluorescence intensity over
the entire emission spectrum. Note that the fluorescence intensity of the normal
columnar sample is indistinguishable from that of the HG SIL. Normalized spectra at
380 nm excitation (Fig. 26(b)) indicate that over the wavelength range 400 to 450 nm, the
fluorescence intensity of the normal squamous tissue is slightly greater than that of
SILs and that of the normal columnar tissue is less than that of SILs. The opposite
phenomenon is observed above 580 nm. A careful examination of the spectra of the
LG SIL and HG SIL indicates that between 460 and 580 nm, the normalized
fluorescence intensity of the LG SIL is higher than that of the HG SIL. The
normalized, mean-scaled spectra (Fig. 26(c)) enhance the previously observed
normalized spectral line shape differences by displaying them relative to the average
normalized spectrum of this patient. Fig. 26(c) indicates that between 400 and 450 nm,
the fluorescence intensity of the normal squamous tissue is greater than the mean and
that of the normal columnar tissue is less than the mean. The opposite phenomenon is
observed above 460 nm. The fluorescence intensity of the SILs is bounded by the
intensities of the two normal tissue types over the entire emission spectrum. The LG
SIL and HG SIL also show spectral line shape differences; above 460 nm, the
normalized, mean-scaled fluorescence intensity of the LG SIL lies above the mean and
that of the HG SIL lies below the mean.

FIG. 27(a-c) illustrate original, normalized and normalized, mean-scaled
spectra, respectively, at 460 nm excitation from the same patient. Evaluation of Fig.
27(a) indicates that the fluorescence intensity of SILs is less than that of the
corresponding normal squamous tissue and greater than that of the corresponding
normal columnar sample over the entire emission spectrum. Evaluation of normalized
spectra at this excitation wavelength (Fig. 27(b)) demonstrates that below 510 nm, the
fluorescence intensity of SILs is less than that of the normal squamous tissue and
greater than that of the corresponding normal columnar tissue. Above 580 nm, the
normalized fluorescence intensity of SILs is less than that of the normal columnar
tissue and greater than that of normal squamous tissue. Note that there are spectral
line shape differences between the LG SIL and HG SIL between 580 and 660 nm; the
normalized fluorescence intensity of the LG SIL is greater than that of the HG SIL.
The normalized, mean-scaled spectra shown in Fig. 27(c) reflect the differences
observed in the normalized spectra relative to the average normalized spectrum of this
patient. Below 510 nm, the fluorescence intensity of the normal squamous tissue is
greater than the mean, while that of the normal columnar tissue is less than the mean.
Above 580 nm, the opposite phenomenon is observed. The fluorescence intensity of
the SILs lies between those of the two normal tissue types. Above 580 nm, the
fluorescence intensity of the LG SIL is greater than the mean and that of the HG SIL is
less than the mean.

Principal Component Analysis and Logistic Discrimination 

Constituent algorithm (1) which differentiates SILs from normal squamous tissues 

A constituent algorithm based on normalized spectra arranged in series at all
three excitation wavelengths provided the greatest discrimination between SILs and
normal squamous tissues. The algorithm demonstrated an incremental improvement in
sensitivity without sacrificing specificity relative to the previously developed
constituent algorithm (1) that employed normalized, mean-scaled spectra at 337 nm
excitation only. Multivariate statistical analysis of normalized tissue spectra at all
three excitation wavelengths indicated that three principal components show statistically
significant differences between SILs and normal squamous tissues (Table 2). These
three principal components account collectively for 65% of the total variance of the
spectral data set. Logistic discrimination was used to develop a classification
algorithm to discriminate between SILs and normal squamous epithelia based on these
three informative principal components. Prior probabilities were determined by
calculating the percentage of each tissue type from the data set: 62% normal
squamous tissues and 38% SILs. The cost of misclassification of SIL was optimized at
0.7. Posterior probabilities of belonging to each tissue type were calculated for all
samples from the data set, using the known prior probabilities, cost of
misclassification of SILs and the conditional joint probabilities calculated from the
normal probability density function. FIG. 28 illustrates the retrospective accuracy of
the algorithm applied to the calibration data set. The posterior probability of being
classified into the SIL category is plotted for all SILs and normal squamous epithelia.
FIG. 28 indicates that 92% of HG SILs and 81% of LG SILs are correctly classified
with a posterior probability greater than 0.5. Approximately 70% of colposcopically
normal squamous epithelia are correctly classified with a posterior probability less
than 0.5.

The confusion matrix in Table 3 compares the retrospective accuracy of the
algorithm on the calibration data set to its prospective accuracy on the prediction set.
In the confusion matrix, the first row corresponds to the histo-pathologic classification
and the first column corresponds to the spectroscopic classification of the samples. A
prospective evaluation of the algorithm's accuracy indicates that there is a small
increase in the proportion of correctly classified LG SILs and no change in the
proportion of correctly classified HG SILs or normal squamous tissues. Note that the
majority of normal columnar tissues and samples with inflammation from both
calibration and prediction sets are misclassified as SIL using this algorithm.
Evaluation of the misclassified SILs from the calibration set indicates that one sample
(out of 19) with CIN III, two samples (out of 16) with CIN II, two samples (out of 16)
with CIN I and two samples (out of 7) with HPV are incorrectly classified. From the
prediction set, two samples (out of 19) with CIN III, one sample (out of 16) with CIN
II, two samples (out of 16) with CIN I and one sample (out of 8) with HPV are
incorrectly classified as non SIL.

Constituent algorithm (2) which differentiates SILs from normal columnar tissues

The greatest discrimination between SILs and normal columnar epithelia was 
achieved using a constituent algorithm based on normalized, mean-scaled spectra at 
all three excitation wavelengths. This algorithm demonstrated a substantially 
improved sensitivity for a similar specificity relative to the previously developed 
constituent algorithm (2) which used normalized, mean-scaled spectra at 380 nm 
excitation only. Multivariate statistical analysis of a combination of normalized,
mean-scaled tissue spectra at all three excitation wavelengths resulted in four 
principal components that demonstrate statistically significant differences between 
SILs and normal columnar epithelia (Table 2). These four principal components 
collectively account for 80% of the total variance of the spectral data set. Logistic 
discrimination was employed to develop a classification algorithm to discriminate 
between SILs and normal columnar epithelia. The prior probabilities were determined 
to be: 28% normal columnar tissues and 72% SILs. The optimized cost of 
misclassification of SIL was equal to 0.58. Posterior probabilities of belonging to each
tissue type were calculated for all samples from the data set. FIG. 29 illustrates the 
retrospective accuracy of the algorithm applied to the calibration data set. The 
posterior probability of being classified into the SIL category is plotted for all SBLs 
and normal columnar samples examined. FIG. 29 graphically indicates that 91% of 
HG SILs and 83% of LG SDLs have a posterior probability that is greater than 0.5. 
Seventy-six percent of colposcopically normal ; columnar epithelia are correctly 
classified with a posterior probability less than 0.5. 
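
The probability-based classification described above can be sketched as follows. This is a minimal illustration only. It assumes, although this section does not say so, that the class-conditional density of a principal component score is Gaussian, that the cost of misclassifying the other class is one minus the stated cost, and that the cost enters as a multiplicative weight so that a threshold of 0.5 applies to the weighted posterior. The function names, the single-score simplification and the Gaussian parameters are hypothetical; only the prior probabilities (72% SIL, 28% normal columnar) and the cost of 0.58 come from the text.

```python
import math


def gaussian_pdf(x, mean, std):
    """Normal density; assumed here as the class-conditional model."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def weighted_posterior_sil(score, prior_sil, cost_sil, sil_params, normal_params):
    """Cost-weighted posterior probability that a sample is SIL.

    score         -- one principal component score (hypothetical simplification)
    prior_sil     -- prior probability of SIL (0.72 in the text)
    cost_sil      -- cost of misclassifying a SIL (0.58 in the text)
    sil_params    -- (mean, std) of the assumed Gaussian for SILs
    normal_params -- (mean, std) of the assumed Gaussian for normal columnar
    """
    w_sil = cost_sil * prior_sil * gaussian_pdf(score, *sil_params)
    # Assumption: the cost for the other class is 1 - cost_sil.
    w_normal = (1.0 - cost_sil) * (1.0 - prior_sil) * gaussian_pdf(score, *normal_params)
    return w_sil / (w_sil + w_normal)


def classify(posterior, threshold=0.5):
    """Assign SIL when the weighted posterior exceeds the threshold."""
    return "SIL" if posterior > threshold else "normal columnar"


# Illustrative call with made-up Gaussian parameters.
p = weighted_posterior_sil(0.0, 0.72, 0.58, (1.0, 1.0), (-1.0, 1.0))
label = classify(p)
```

With equal likelihoods, as in the illustrative call, the result is driven entirely by the prior and the cost, which shows how both shift the decision toward SIL.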

The confusion matrix in Table 4 compares the retrospective accuracy of the 
constituent algorithm on the calibration data set to its prospective accuracy on the 
prediction set. The prospective accuracy of the algorithm (Table 4) indicates that there 
is a small increase in the proportion of correctly classified LG SILs and a small 
decrease in the proportion of correctly classified HG SILs; there is approximately a 
10% decrease in the proportion of correctly classified normal columnar tissues. Note
that the majority of normal squamous tissues and samples with inflammation from 
both the calibration and prediction sets are misclassified as SIL using this algorithm. 
Evaluation of the misclassified SILs from the calibration set indicates that three 
samples (out of 16) with CIN II, three samples (out of 16) with CIN I and one sample
(out of 7) with HPV are incorrectly classified. From the prediction set, two samples
(out of 19) with CIN III, three samples (out of 16) with CIN II, and three samples (out
of 16) with CIN I are incorrectly classified. 

Constituent algorithm (3) which differentiates HG SILs and LG SILs 

A combination of normalized spectra at all three excitation wavelengths 
significantly enhanced the accuracy of the previously developed constituent algorithm 
(3) which differentiated HG SILs from LG SILs using normalized spectra at 460 nm 
excitation. Multivariate statistical analysis of normalized spectra at all three excitation 
wavelengths resulted in four statistically significant principal components, that 
account collectively for 67% of the total variance of the spectral data set (Table 2). 
Again, a probability based classification algorithm was developed to differentiate HG 
SILs from LG SILs. The prior probability was: 40% LG SILs and 60% HG SILs. The 
optimal cost of misclassification of HG SIL was equal to 0.51. Posterior probabilities 
of belonging to each tissue type were calculated. FIG. 30 illustrates the retrospective
accuracy of the algorithm applied to the calibration data set. The posterior probability
of being classified into the HG SIL category is plotted for all SILs evaluated. FIG. 30
indicates that 83% of HG SILs have a posterior probability greater than 0.5, and 70%
of LG SILs have a posterior probability less than 0.5.

The confusion matrix in Table 5 compares the retrospective accuracy of the 
constituent algorithm on the calibration set to its prospective accuracy on the 
prediction set. Its prospective accuracy indicates that there is a 5% decrease in the 
proportion of correctly classified LG SILs and no change in the proportion of correctly
classified HG SILs. From the calibration set, six HG SILs are misclassified; three
samples (out of 19) with CIN III and three samples (out of 16) with CIN II are
misclassified as LG SIL. The misclassified LG SILs comprise five samples (out of
16) with CIN I and two samples (out of 7) with HPV. From the prediction set, five HG
SILs are misclassified; two samples (out of 19) with CIN III and three (out of 16) with
CIN II. There were ten misclassified LG SILs from the prediction set: seven with CIN
I (out of 16) and three (out of 8) with HPV.

"Full-parameter" composite screening and diagnostic algorithms

A composite screening algorithm was developed to differentiate SILs and non 
SILs (normal squamous and columnar epithelia and inflammation) and a composite 
diagnostic algorithm was developed to differentiate HG SILs from non HG SILs (LG 
SILs, normal epithelia and inflammation). The effective accuracies of both composite
algorithms were compared to those of the constituent algorithms from which they
were developed and to the accuracy of current detection modalities. 

A composite screening algorithm which discriminates between SILs and non SILs 

A composite screening algorithm to differentiate SILs from non SILs was 
developed using a combination of the two constituent algorithms: algorithm (1) which
differentiates SILs from normal squamous tissues and algorithm (2) which
differentiates SILs from normal columnar epithelia. The optimal cost of
misclassification of SIL was equal to 0.66 for constituent algorithm (1) and 0.64 for
constituent algorithm (2). Only the costs of misclassification of SIL of the two
constituent algorithms were altered for the development of the composite screening
algorithm. These costs were selected to minimize the total number of misclassified 
samples. 
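
This section does not spell out the rule used to combine the two constituent algorithms. One simple reading, consistent with the lower false-positive rate reported for the composite algorithm, is a sequential rule in which a sample is called SIL only if both constituent algorithms assign it a posterior probability of SIL above 0.5. The sketch below illustrates that assumed rule; the function name, argument names and threshold default are hypothetical and are not taken from the source.

```python
def composite_screen(p_sil_vs_squamous, p_sil_vs_columnar, threshold=0.5):
    """Assumed sequential combination of constituent algorithms (1) and (2).

    p_sil_vs_squamous -- posterior of SIL from algorithm (1)
    p_sil_vs_columnar -- posterior of SIL from algorithm (2)
    """
    # Step 1: algorithm (1) screens out samples that look like normal squamous tissue.
    if p_sil_vs_squamous < threshold:
        return "non-SIL"
    # Step 2: algorithm (2) screens out samples that look like normal columnar tissue.
    if p_sil_vs_columnar < threshold:
        return "non-SIL"
    # Only samples called SIL by both constituent algorithms remain.
    return "SIL"
```

Under this assumed rule, a normal sample is a false positive only when both constituent algorithms misclassify it, which is one way a combination could trade a little sensitivity for specificity.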

The accuracy of the composite screening algorithm on the calibration and 
prediction data sets is illustrated in the confusion matrix in Table 6. Examination of 
the confusion matrix indicates that the algorithm correctly classifies approximately 
90% of HG SILs and 75% of LG SILs from the calibration data set. Furthermore,
approximately 80% of normal squamous tissues and 70% of normal columnar
epithelia from the calibration set are correctly classified. Evaluation of the prediction
set indicates that there is a small change in the proportion of correctly classified HG
SILs and LG SILs. There is a negligible change in the correct classification of normal
squamous and columnar tissues. Note that while 80% of samples with inflammation
from the calibration set are incorrectly classified as SIL, only 43% of these samples
from the prediction set are incorrectly classified. 

A comparison of the accuracy of the composite screening algorithm (Table 6) 
to that of each of the constituent algorithms (1) (Table 3) and (2) (Table 4) on the 
same spectral data set indicates that in general, there is less than a 10% decrease in the 
proportion of correctly classified SILs using the composite screening algorithm
relative to using either of the constituent algorithms independently. Note, however, that
the proportion of correctly classified normal (squamous and columnar) epithelia is
substantially higher using the composite algorithm relative to using either of the
constituent algorithms independently. These results confirm that utilization of a
combination of the two constituent algorithms significantly reduces the false-positive
rate relative to that using each algorithm independently. Evaluation of the
spectroscopically misclassified SILs from the calibration set (Table 6) indicates that
only one sample (out of 19) with CIN III, three samples (out of 16) with CIN II, two
samples (out of 16) with CIN I and four samples (out of 7) with HPV are incorrectly
classified. From the prediction data set (Table 6), two samples (out of 19) with CIN
III, four samples (out of 16) with CIN II, three samples (out of 16) with CIN I and one
sample (out of 8) with HPV are incorrectly classified.

A composite diagnostic algorithm which differentiates HG SILs from non HG SILs 

A composite diagnostic algorithm which differentially detects HG SILs was 
developed using a combination of all three constituent algorithms: algorithm (1) 
which differentiates SILs from normal squamous tissues, algorithm (2) which
differentiates SILs from normal columnar epithelia and algorithm (3) which
differentiates HG SILs from LG SILs. The optimal cost of misclassification of SIL
was equal to 0.87 for algorithm (1) and 0.65 for algorithm (2); the optimal cost of
misclassification of HG SIL was equal to 0.49 for algorithm (3). Only the costs of 
misclassification of SIL of constituent algorithms (1) and (2) and the cost of 
misclassification of HG SIL of constituent algorithm (3) were altered during 
development of the composite diagnostic algorithm. These costs were selected to 
minimize the total number of misclassified samples. 

The results of the composite diagnostic algorithm on the calibration and 
prediction sets are shown in the confusion matrix in Table 7. The algorithm correctly 
classifies 80% of HG SILs, 74% of LG SILs and more than 80% of normal epithelia. 
Evaluation of the prediction set using this composite algorithm indicates that there is 
only a 3% decrease in the proportion of correctly classified HG SILs and a 7%
decrease in the proportion of correctly classified LG SILs. There is less than a 10%
decrease in the proportion of correctly classified normal epithelia. A comparison
between the calibration and prediction sets indicates that while more than 70% of
samples with inflammation from the calibration data set are incorrectly classified as
HG SIL, only 14% of samples with inflammation from the prediction set are
incorrectly identified. Due to the relatively small number of samples examined in this
histopathologic category, the results presented here do not conclusively establish if 
the algorithm is capable of correctly identifying inflammation. 

A comparison of the accuracy of the composite diagnostic algorithm to that of
constituent algorithm (3) which differentiates HG SILs from LG SILs (Table 5)
indicates there is less than a 5% decrease in the proportion of correctly classified HG
SILs and a 5% increase in the proportion of correctly classified LG SILs using the
composite diagnostic algorithm relative to using the constituent algorithm (3).
Evaluation of the HG SILs from the calibration set (Table 7) that were incorrectly
classified indicates that three samples (out of 19) with CIN III and four samples (out
of 16) with CIN II are incorrectly classified. From the prediction set, four samples (out
of 19) with CIN III and five samples (out of 16) with CIN II are incorrectly classified.

"Reduced-parameter" composite screening and diagnostic algorithms 

Component Loadings: A component loading represents the correlation between each
principal component and the original pre-processed fluorescence emission spectra at a
particular excitation wavelength. FIGS. 31(a-c) illustrate component loadings of the
diagnostically relevant principal components of constituent algorithm (1) obtained
from normalized spectra at 337, 380 and 460 nm excitation, respectively. FIGS. 32(a-c)
display component loadings that correspond to the diagnostically relevant principal
components of constituent algorithm (2) obtained from normalized, mean-scaled
spectra at 337, 380 and 460 nm excitation, respectively. Finally, FIGS. 33(a-c) display
the component loadings corresponding to the diagnostically relevant principal
components of constituent algorithm (3), obtained from normalized spectra at 337,
380 and 460 nm excitation, respectively. In each graph shown, the abscissa
corresponds to the emission wavelength range at a particular excitation wavelength
and the ordinate corresponds to the correlation coefficient of the component loading.
Correlation coefficients of the component loading above 0.5 and below -0.5 are
considered to be significant.
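
The definition above can be illustrated with a short sketch. It assumes the principal component scores have already been computed for each sample and takes the component loading at each emission wavelength to be the Pearson correlation between those scores and the pre-processed intensities at that wavelength. The function names and the toy data are hypothetical; only the significance threshold of 0.5 in absolute value comes from the text.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def component_loading(spectra, scores):
    """Correlation of one principal component with each emission wavelength.

    spectra -- list of samples, each a list of pre-processed intensities
    scores  -- the principal component score of each sample
    """
    n_wavelengths = len(spectra[0])
    return [pearson([s[j] for s in spectra], scores) for j in range(n_wavelengths)]


def significant_indices(loading, threshold=0.5):
    """Indices of emission wavelengths whose loading exceeds the threshold in magnitude."""
    return [j for j, r in enumerate(loading) if abs(r) > threshold]


# Toy data (made up): four samples, three emission wavelengths.
toy_spectra = [[1.0, 5.0, 2.0], [2.0, 4.0, 2.1], [3.0, 3.0, 1.9], [4.0, 2.0, 2.0]]
toy_scores = [1.0, 2.0, 3.0, 4.0]
toy_loading = component_loading(toy_spectra, toy_scores)
selected = significant_indices(toy_loading)  # -> [0, 1]
```

In the toy data the first wavelength is perfectly positively correlated with the scores, the second perfectly negatively correlated, and the third only weakly correlated, so only the first two would be retained.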

FIGS. 31(a), 32(a) and 33(a) display component loadings of principal
components of constituent algorithms (1), (2) and (3), respectively, obtained from
pre-processed spectra at 337 nm excitation. A closer examination indicates that
component loading 1 is nearly identical for all three algorithms. Evaluation of this
loading indicates that it is positively correlated with corresponding emission spectra
over the wavelength range 360-440 nm and negatively correlated with corresponding
emission spectra over the wavelength range 460-660 nm. All remaining principal
components of all three algorithms display a correlation between -0.5 and 0.5, except
component loading 4 of algorithm (2) (Fig. 32(a)) which displays a positive
correlation of 0.75 with the corresponding emission spectra at 460 nm.

FIGS. 31(b), 32(b) and 33(b) display component loadings that correspond to
the diagnostically relevant principal components of constituent algorithms (1), (2) and
(3), respectively, obtained from pre-processed spectra at 380 nm excitation.
Component loading 1 of all three algorithms is positively correlated with
corresponding emission spectra over the wavelength range 400-450 nm. Between
500-600 nm, component loading 1 of algorithm (2) (Fig. 32(b)) is correlated
negatively with corresponding emission spectra. Examination of component loading 3
of algorithm (1) (Fig. 31(b)) and algorithm (3) (Fig. 33(b)) indicates that they are also
negatively correlated with corresponding emission spectra from 500-600 nm. Only
component loading 2 of algorithm (2) (Fig. 32(b)) is positively correlated with
corresponding emission spectra from 500-600 nm. Also note that component loading
3 of algorithm (1) (Fig. 31(b)) and component loadings 3 and 6 of algorithm (3) (Fig.
33(b)) display a correlation with corresponding emission spectra at approximately 640
nm.

FIGS. 31(c), 32(c) and 33(c) display component loadings that correspond to
the diagnostic principal components of constituent algorithms (1), (2) and (3),
respectively, obtained from pre-processed spectra at 460 nm excitation. Note that only
component loading 1 displays a negative correlation (< -0.5) with corresponding
emission spectra for all three algorithms. This component loading is correlated with
corresponding emission spectra over the wavelength range 580-660 nm. The
remaining principal components of all three algorithms display a correlation between
-0.5 and 0.5.

The component loadings at all three excitation wavelengths of all three 
constituent algorithms were evaluated to select fluorescence intensities at a minimum 
number of excitation-emission wavelength pairs required for the previously developed 
constituent and composite algorithms to perform with a minimal decrease in 
classification accuracy. Portions of the component loadings of the three constituent 
algorithms most highly correlated (correlation > 0.5 or < -0.5) with corresponding 
emission spectra at each excitation wavelength were selected and the reduced data 
matrix was then used to regenerate and evaluate the constituent and composite 
algorithms. It was iteratively determined that fluorescence intensities at a minimum of 
15 excitation-emission wavelength pairs are required to re-develop constituent and 
composite algorithms that demonstrate a minimum decrease in classification accuracy. 
At 337 nm excitation, fluorescence intensities at two emission wavelengths between 
360-450 nm and intensities at two emission wavelengths between 460-660 nm were 
selected. At 380 nm excitation, intensities at two emission wavelengths between
400-450 nm and intensities at four emission wavelengths between 500-640 nm were
selected. Finally, at 460 nm excitation, fluorescence intensities at five emission
wavelengths over the range 580-660 nm were selected. Table 8 lists these
excitation-emission wavelength pairs for each of the three constituent algorithms (1), (2) and
(3). These excitation-emission wavelength pairs are also indicated on the component
loading plots in Figs. 31-33. The bandwidth at each emission wavelength is 10 nm. 
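
As a quick arithmetic check of the selection just described, the emission-wavelength counts per excitation wavelength can be tallied. The dictionary layout and variable names below are illustrative only; the counts and emission ranges are those stated in the paragraph above, and the full-parameter count of 161 pairs is the figure given later in the Discussion.

```python
# Number of emission wavelengths selected in each emission band,
# keyed by excitation wavelength in nm (counts taken from the text).
selected_counts = {
    337: {"360-450 nm": 2, "460-660 nm": 2},
    380: {"400-450 nm": 2, "500-640 nm": 4},
    460: {"580-660 nm": 5},
}

# Total number of excitation-emission wavelength pairs.
total_pairs = sum(sum(bands.values()) for bands in selected_counts.values())

# Share of the full-parameter set (161 pairs) that is retained.
FULL_PARAMETER_PAIRS = 161
retained_percent = 100 * total_pairs / FULL_PARAMETER_PAIRS
```

The tally gives 4 pairs at 337 nm, 6 at 380 nm and 5 at 460 nm, for the 15 pairs quoted in the text, or roughly 9% of the 161 pairs used by the full-parameter algorithms.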

Reduced-parameter composite algorithms 

Using the fluorescence intensities only at the selected excitation-emission 
wavelength pairs, the three constituent algorithms were re-developed using the same 
formal analytical process as was done previously using the entire fluorescence 
emission spectra at all three excitation wavelengths (Fig. 24). The three constituent 
algorithms were then independently optimized using the calibration set and tested 
prospectively on the prediction data set. They were combined as described previously 
into composite screening and diagnostic algorithms. The effective accuracy of these
reduced-parameter composite algorithms was compared to that of the full-parameter
composite algorithms developed previously using fluorescence emission spectra at all
three excitation wavelengths.

Table 9 displays the accuracy of the reduced-parameter composite screening
algorithm (based on fluorescence intensities at 15 excitation-emission wavelength
pairs) which discriminates between SILs and non SILs applied to the calibration and
prediction sets. A comparison between the calibration and prediction data sets
indicates that there is less than a 10% decrease in the proportion of correctly classified
SILs and normal squamous tissues from the prediction set. Note, however, that there is
a 20% increase in the proportion of correctly classified normal columnar epithelia and
a 40% increase in the proportion of correctly classified samples with inflammation
from the prediction set.

The accuracy of the reduced-parameter composite screening algorithm (Table
9) was compared to that of the full-parameter composite screening algorithm (Table
6) applied to the same spectral data set. A comparison indicates that in general there is
less than a 10% decrease in the accuracy of the reduced-parameter composite
algorithm relative to that of the full-parameter composite screening algorithm, except
for a 20% decrease in the proportion of correctly classified normal columnar epithelia
from the calibration set tested using the reduced-parameter composite screening
algorithm (Table 9).

Table 10 displays the accuracy of the reduced-parameter composite diagnostic
algorithm that differentially identifies HG SILs from the calibration and prediction
sets. A comparison of sample classification between the calibration and prediction
data sets indicates that there is negligible change in the proportion of correctly
classified HG SILs, LG SILs and normal squamous epithelia. Note that there is
approximately a 20% increase in the proportion of correctly classified normal
columnar epithelia and samples with inflammation from the prediction set.

A comparison of the composite diagnostic algorithm based on the reduced
emission variables (Table 10) to that using fluorescence emission spectra at all three
excitation wavelengths (Table 7) applied to the same spectral data set indicates that in
general, the accuracy of the reduced-parameter composite diagnostic algorithm is
within 10% of that reported for the full-parameter composite diagnostic algorithm;
however, a comparison between Tables 7 and 10 indicates that there is approximately
a 15% decrease and a 20% increase in the proportion of correctly classified normal
columnar epithelia from the calibration and prediction sets (Table 10), respectively,
which were tested using the reduced-parameter composite diagnostic algorithm. The
opposite trend is observed for samples with inflammation tested using the
reduced-parameter composite diagnostic algorithm (Table 10).

Table 11 compares the sensitivity and specificity of the full-parameter and
reduced-parameter composite algorithms to those of Pap smear screening and
colposcopy in expert hands. Table 11 indicates that the composite screening
algorithms have a similar specificity and a significantly improved sensitivity relative
to Pap smear screening. A comparison of the sensitivity of the composite screening
algorithms to that of colposcopy in expert hands for differentiating SILs from non
SILs indicates that these algorithms demonstrate a 10% decrease in sensitivity, but a
20% improvement in specificity. The composite diagnostic algorithms and colposcopy
in expert hands discriminate HG SILs from non HG SILs with a very similar
sensitivity and specificity. Also note that the variability (standard deviation) of both
Pap smear screening and colposcopy in expert hands is substantially higher than that
of the full-parameter and reduced-parameter screening and diagnostic algorithms. A
comparison between the full-parameter and reduced-parameter composite algorithms
indicates that the algorithms based on the reduced emission variables demonstrate a
minimal decrease in classification accuracy relative to those that employ fluorescence
emission spectra at all three excitation wavelengths.
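
For reference, sensitivity and specificity follow from the entries of a two-class confusion matrix in the standard way. The sketch below uses those standard definitions; the counts in the example call are made up for illustration and are not taken from Tables 6 through 11.

```python
def sensitivity(true_positives, false_negatives):
    """Fraction of diseased samples (e.g. SILs) that are correctly classified."""
    return true_positives / (true_positives + false_negatives)


def specificity(true_negatives, false_positives):
    """Fraction of non-diseased samples (e.g. non SILs) that are correctly classified."""
    return true_negatives / (true_negatives + false_positives)


# Hypothetical counts, for illustration only.
example_sensitivity = sensitivity(true_positives=45, false_negatives=5)
example_specificity = specificity(true_negatives=70, false_positives=30)
```

For the screening comparison, the positive class is SIL and the negative class is non SIL; for the diagnostic comparison, the positive class is HG SIL and the negative class is non HG SIL.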

Discussion and Conclusions 

Cervical tissue fluorescence spectra recorded at 337, 380 and 460 nm
excitation can be used to develop composite screening and diagnostic algorithms for
the differential detection of SILs in vivo. The composite screening algorithm
discriminates between SILs and non SILs with a similar specificity and a substantially
improved sensitivity relative to standard Pap smear screening. When compared to
colposcopy in expert hands, the composite screening algorithm displays a 10%
decrease in sensitivity but almost a 20% improvement in specificity. A comparison
between the composite diagnostic algorithm and colposcopy in the hands of expert
practitioners indicates that both have a very similar sensitivity and specificity for
discriminating between HG SILs and non HG SILs. Note that as spectroscopic
interrogation of diseased and non-diseased cervical tissue sites in the current clinical
study was directed by colposcopic impression, the sensitivity of the spectroscopic
algorithms could not exceed the sensitivity of colposcopy. In other words, if there
were histologically diseased cervical tissue sites that were overlooked by colposcopy,
these false-negatives were not evaluated spectroscopically. As a result, the
potential of fluorescence spectroscopy to correctly classify these false-negatives could
not be determined.

The full-parameter composite algorithms were re-developed using 
fluorescence intensities at 15 excitation-emission wavelength pairs, to generate 
reduced-parameter composite algorithms. The fluorescence intensities at this reduced
number of excitation-emission wavelength pairs were selected using a parameter
called the component loading calculated from the principal components. Evaluation of
the reduced-parameter composite algorithms indicates that they display a minimal
decrease in sensitivity and specificity relative to the full-parameter composite
algorithms. The reduction in the number of excitation-emission wavelength pairs from
161 to 15 implies a reduction in the complexity and cost of the portable fluorimeter
which would be used to measure cervical tissue fluorescence. For example, if
fluorescence intensities at only 15 excitation-emission wavelength pairs need to be
measured, the polychromator and intensified diode array can be replaced by a
mechanical filter assembly and a single channel detector. This represents a substantial 
decrease in cost and complexity of this instrumentation at the expense of less than a 
1 % decrease in sensitivity. 

Several significant improvements and refinements have been made in 
previously developed constituent algorithms using tissue spectra at all three excitation 
wavelengths. Previously, the constituent algorithm (1) which differentiates SILs from 
normal squamous epithelia was developed using normalized, mean-scaled spectra at a 
single excitation wavelength: 337 nm. Spectra at this excitation wavelength had to be 
mean-scaled in order to calibrate for the significant inter-patient variation in spectral 
line shape. This algorithm demonstrates the greatest classification accuracy when the 
patient being evaluated has equal numbers of diseased and non-diseased tissue sites. 
This restriction clearly reduces the clinical effectiveness of this algorithm. The new
algorithm, which is based on normalized emission spectra at all three excitation
wavelengths, minimizes this inter-patient variation and hence obviates the need for
mean-scaling, while maintaining a slightly improved classification accuracy. Inclusion
of spectra at additional excitation wavelengths represents a significant improvement in
the clinical effectiveness of this algorithm as it can be applied to a much wider
population of patients.
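
The two pre-processing steps contrasted above can be sketched as follows. This is an illustration under stated assumptions, since this section does not define the operations: normalization is taken to mean dividing each spectrum by its peak intensity, and mean-scaling is taken to mean subtracting a patient's mean spectrum from each of that patient's spectra. The function names are hypothetical.

```python
def normalize(spectrum):
    """Assumed normalization: divide a spectrum by its peak intensity."""
    peak = max(spectrum)
    return [value / peak for value in spectrum]


def mean_scale(patient_spectra):
    """Assumed mean-scaling: subtract the patient's mean spectrum.

    patient_spectra -- list of spectra (lists of intensities) from ONE patient
    """
    n_sites = len(patient_spectra)
    # Mean intensity at each emission wavelength across this patient's sites.
    mean_spectrum = [sum(column) / n_sites for column in zip(*patient_spectra)]
    return [
        [value - mean for value, mean in zip(spectrum, mean_spectrum)]
        for spectrum in patient_spectra
    ]
```

Under this assumed definition, the patient's mean spectrum depends on the mix of diseased and non-diseased sites measured, which is consistent with the observation above that the mean-scaled algorithm performs best when a patient has equal numbers of each.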

The accuracy of previously developed constituent algorithm (2) which
discriminates between SILs and normal columnar epithelia was significantly improved
by using normalized, mean-scaled spectra at all three excitation wavelengths rather
than at a single excitation wavelength. Despite the significant improvement in these
results, this algorithm is also based on tissue spectra that require mean-scaling at each
excitation wavelength. A multivariate statistical algorithm based on normalized
spectra only, at all three excitation wavelengths, differentiates SILs from normal
columnar epithelia with a significantly poorer sensitivity than the algorithm that uses
normalized, mean-scaled spectra at all three excitation wavelengths. Therefore,
mean-scaling is essential for the optimal operation of this algorithm.

Finally, a significant improvement is the development of the third
constituent algorithm which discriminates between LG SILs and HG SILs using tissue
spectra at all three excitation wavelengths. The utilization of spectra at all three
excitation wavelengths results in a substantial improvement in sensitivity relative to
using the constituent algorithm (3) which is based on a single excitation wavelength.
Furthermore, spectra required for this algorithm do not have to be mean-scaled for
inter-patient variation in spectral line shape.

Each of the three constituent algorithms developed using spectral data from
the current clinical study discriminates between a specific pair of tissue types. Using
each constituent algorithm, a posterior probability assignment of an unknown sample
to a particular tissue category is calculated using a set of diagnostically relevant
principal components that demonstrate statistically significant differences between the
two tissue types under consideration. The posterior probability outputs of the
constituent algorithms are then combined to develop composite screening and
diagnostic algorithms that discriminate between many of the clinically relevant tissue
types. Hence, development of the two composite algorithms is based on the prior
development of the three constituent algorithms.

To test the feasibility of an alternate approach, the two composite algorithms 
were developed directly from diagnostically relevant principal components of their 
corresponding constituent algorithms, thereby by-passing the constituent algorithm 
development phase. The composite screening algorithm which discriminates between 
SILs and non SILs was developed using logistic discrimination based on the
diagnostically relevant principal components of constituent algorithms (1) and (2); the 
posterior probability of an unknown sample being classified as either SIL or non SIL 
was calculated. The composite diagnostic algorithm which discriminates between HG 
SILs and non HG SILs was developed using logistic discrimination based on the 
diagnostically relevant principal components of constituent algorithms (1), (2) and (3); 
the posterior probability of an unknown sample being classified as either HG SIL or 
non HG SIL was calculated. The composite algorithms developed directly from the 
diagnostically relevant principal components of their corresponding constituent 
algorithms demonstrated a poorer classification accuracy relative to composite 
algorithms that were developed using a combination of corresponding constituent 
algorithms. Therefore, composite screening and diagnostic algorithms were developed 
using a combination of independently developed constituent algorithms. 

Pre-processing to remove inter-patient and intra-patient variation prior to the 
development of the multivariate statistical algorithm may remove the spectral 
variations that may be significant from a biological standpoint. However, in the 
development of multivariate statistical screening and diagnostic algorithms that can 
successfully identify disease in any given patient, the intra-patient and inter-patient 
spectral variations must be removed if they do obscure the important inter-category 
differences that the algorithm needs to extract. If a sophisticated physical model can 
be developed to describe the biological basis of the spectral data as well as the inter- 
patient and intra-patient spectral variations accurately, then this information can be 
used to develop better methods of pre-processing or direct the need for additional 
measurements to calibrate for these variations. This is an important issue to address 
and is currently the subject of study in our laboratory.

In spite of the successful development of algorithms that can differentiate (1)
SILs from normal tissues and (2) HG SILs from non HG SILs and normal epithelia,
these algorithms do not consistently classify samples with inflammation as non SIL;
this results in a decrease in their specificity. Although the number of samples
examined in this histo-pathologic category is limited, analysis from previous and
current clinical studies indicates that it is relatively difficult to correctly classify these
samples. A plausible explanation for this is that (1) the current excitation wavelengths
used may not be optimum for identification of fluorophores that are unique to
inflammation and/or (2) the penetration depth of the light may not be sufficiently long
to spectroscopically interrogate the underlying stromal layers where inflammation
develops.

The specificity of fluorescence spectroscopy for the detection of cervical 
neoplasia may be improved by using fluorescent photosensitizers to enhance the 
contrast between neoplastic and non-neoplastic tissues in vivo. The use of
photosensitizers such as photofrin, hematoporphyrin derivative or 5-ALA may
potentially enhance the spectroscopic differences between neoplastic and non- 
neoplastic (normal and inflammatory) cervical tissues and hence contribute to an 
improved specificity of the spectroscopic algorithms. 

Another limitation is that the portable fluorimeter described in this Example to
measure in vivo tissue fluorescence spectra utilizes a single-pixel probe that
interrogates a 1 mm diameter area on the cervix. Although the single-pixel probe that
the inventors have used provides the capability to determine whether a small region of
cervical tissue contains pre-cancerous changes, mapping the entire cervix with this
system is extremely time consuming, making wide-scale application of this
technology impractical. To address this limitation, a multi-pixel probe that can
acquire fluorescence spectra from multiple sites on the cervix simultaneously
may be used. This may provide a user not only with information regarding the presence
of pre-cancer but also with its location and extent.

In summary, in vivo fluorescence spectroscopy has the capability to 

30 significantly improve the sensitivity of Pap smear screening and the specificity of 
colposcopy in expert hands. Hence, this technique may play an important clinical role 
as a screening / re-screening tool (to screen women who have already had an initial 
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positive Pap smear, but who have not undergone colposcopy and directed biopsy) and 
as an adjunct to colposcopy in expert hands. Advantages realized by using this 
technique include, but are not limited to: (1) screening and diagnostic information 
may be obtained in near real-time and (2) this technique may be easily automated 
hence reducing the need for subjective interpretation. Furthermore, while the Pap 
smear examines only exfoliated cervical epithelial cells, fluorescence spectroscopy 
may interrogate the full thickness of the epithelium. 

EXAMPLE 3 

Head and Neck Analysis - Fluorescence 

Fluorescence data collected in a clinical head and neck study have 
been analyzed in accordance with the present disclosure. The Example that follows 
describes the analysis of these data. 

Materials and Methods 

Fluorescence excitation emission matrices were measured in vivo from 
sixty-two sites in 9 normal volunteers and 11 patients with a known or suspected 
premalignant or malignant oral cavity lesion. Excitation wavelength ranged from 330 
to 500 nm and emission wavelength ranged from 340 to 600 nm. Fluorescence data 
were analyzed to determine which excitation and emission wavelengths contained the 
most diagnostically useful information and to estimate the performance of diagnostic 
algorithms based on this information. Algorithms were developed based on 
combinations of emission spectra at various excitation wavelengths in order to 
determine which excitation wavelengths contained the most diagnostic information. 
Then, at those excitation wavelengths, algorithms were developed based on reduced 
numbers of emission wavelengths to determine whether complete emission spectra 
were required or whether accurate diagnosis could be made using multi-spectral 
measurements at a few excitation/emission wavelength combinations. The algorithm 
development process consisted of the following steps: (1) data pre-processing to 
reduce inter-patient variations, (2) data reduction to reduce the dimensionality of the 
data set, (3) feature selection and classification to develop algorithms which maximized 
diagnostic performance and minimized the likelihood of over-training in a training set, 
and (4) unbiased evaluation of these algorithms using the technique of cross-validation. 

Results 

The optimal excitation wavelengths for the in vivo detection of oral cancers 
with fluorescence spectroscopy were found to be 350, 380 and 400 nm. An unbiased 
estimate of an algorithm based on the entire emission spectra at these excitation 
wavelengths yields a sensitivity of 100% and specificity of 88%. Increasing the 
number of excitation wavelengths did not improve algorithm performance. Better 
algorithm performance was obtained when data were normalized to the peak emission 
intensity of the concatenated vector than when each emission spectrum was 
normalized to its own peak emission intensity. The number of emission 
wavelengths could be significantly reduced without compromising algorithm 
performance. When only a single emission wavelength of 472 nm, common to all 
three excitation wavelengths, was used, algorithm performance on cross-validation was 
90% sensitivity and 88% specificity. The unbiased performance estimate for the 
diagnostic algorithms based on fluorescence spectroscopy shows a higher sensitivity 
than current visual screening techniques performed by experts. 

Study Subjects 

Nine normal volunteers and 11 patients with a known or suspected premalignant 
or malignant oral cavity lesion were recruited to participate in the study at the Head 
and Neck Surgery Clinic at The University of Texas M.D. Anderson Cancer Center. 
Written informed consent was obtained from each person in the study. 

Instrument 

A FastEEM system in accordance with the present disclosure was used for this 
study. Briefly, the system measured fluorescence emission spectra at 18 excitation 
wavelengths, ranging from 330 nm to 500 nm in 10 nm increments. The system 
incorporated a fiberoptic probe, a Xenon arc lamp coupled to a monochromator to 
provide excitation light, and a polychromator and thermo-electrically cooled CCD 
camera to record fluorescence intensity as a function of emission wavelength. 

Calibration 

A background EEM, to be subtracted from the acquired patient data, was 
obtained with the probe immersed in a non-fluorescent bottle filled with distilled 
water at the beginning of each measurement day. Then a fluorescence EEM was 
measured with the probe placed on the surface of a quartz cuvette containing a 
solution of Rhodamine 610 (Exciton, Dayton, OH) dissolved in ethylene glycol (2 
mg/mL). 

To correct for the non-uniform spectral response of the detection system, the 
spectra of two calibrated sources were measured; in the visible an NIST traceable 
calibrated tungsten ribbon filament lamp was used and in the UV a deuterium lamp 
was used (550C and 45D, Optronic Laboratories Inc, Orlando, FL). Correction factors 
were derived from these spectra. Background subtracted EEMs from patients were 
then corrected for the non-uniform spectral response of the detection system. 
Variations in the intensity of the fluorescence excitation light source at different 
excitation wavelengths were corrected using measurements of the intensity at each 
excitation wavelength at the probe tip made using a calibrated photodiode (818-UV, 
Newport Research Corp.). Finally, corrected fluorescence intensities from each site 
were divided by the fluorescence emission intensity of the Rhodamine standard at 460 
nm excitation, 580 nm emission. Thus, data illustrated in this paper are not the 
absolute fluorescence intensities of tissue but rather the intensities relative to the 
Rhodamine standard. 
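
By way of illustration only, the calibration sequence described above may be sketched in Python as follows. The function name, the argument names, and the assumption that each correction is applied as a simple per-wavelength multiplication or division are hypothetical and are not taken from the original disclosure.

```python
def calibrate_emission_spectrum(raw, background, response_correction,
                                excitation_power, rhodamine_reference):
    """Return one emission spectrum expressed relative to the Rhodamine standard.

    raw, background, response_correction: per-emission-wavelength sequences.
    excitation_power: photodiode reading at this excitation wavelength (assumed scalar).
    rhodamine_reference: standard intensity at 460 nm excitation, 580 nm emission.
    """
    if not (len(raw) == len(background) == len(response_correction)):
        raise ValueError("per-wavelength inputs must have the same length")

    calibrated = []
    for counts, dark, factor in zip(raw, background, response_correction):
        signal = counts - dark          # 1. subtract the background EEM
        signal *= factor                # 2. correct the detector spectral response
        signal /= excitation_power      # 3. correct for excitation source intensity
        signal /= rhodamine_reference   # 4. express relative to the standard
        calibrated.append(signal)
    return calibrated


# Hypothetical two-point spectrum, for demonstration only.
print(calibrate_emission_spectrum([110.0, 210.0], [10.0, 10.0], [1.0, 0.5], 2.0, 5.0))
```

The order of the steps follows the order in which they are described in the text; the actual correction factors would come from the calibrated lamp and photodiode measurements.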

Data Acquisition 

Before the probe was used it was disinfected with Metricide (Metrex Research 
Corp.) in accordance with standard protocol. The probe was then guided into the oral 
cavity and its tip positioned flush with the mucosa. Then fluorescence EEMs were 
measured. 

Fluorescence EEMs were measured from 9 volunteers with no history of oral 
cavity neoplasia at 35 clinically normal sites in the oral cavity (Table 1). No biopsies 
were obtained from volunteers. Following visual screening in 11 patients with a 
known or suspected premalignant or malignant oral cavity lesion, fluorescence EEMs 
were measured from 27 sites (Table 1). The physician placed the fiber optic probe on 
a lesion or suspected lesion and the fluorescence of that site was measured. In 
addition to the three to five visually abnormal sites, fluorescence EEMs were 
measured from one to three contralateral normal sites. Post-spectroscopy, abnormal 
sites were tattooed with India Ink where the probe measured the spectra. A clinical 
diagnosis of each lesion as normal, abnormal (not dysplastic), abnormal (dysplastic) 
or cancerous was recorded by an experienced head and neck surgeon (AMG) or dental 
oncologist (RJ). During follow-up surgery, a 2-4 mm biopsy of the tissue was taken 
from the tattooed area. These specimens were evaluated by an experienced 
pathologist (BK) using light microscopy and classified as normal, mucosal reactive 
atypia (MRA), dysplasia or cancer using standard diagnostic criteria. Biopsies with 
multiple diagnoses were classified according to the most severe pathological 
diagnosis. The pathologist and clinicians were blinded to the results of the 
spectroscopic analyses. 

Data Review 

A total of 88 sites were measured from 26 subjects. All spectra were reviewed 
by a single investigator blinded to the pathologic results (DLH). Spectra were 
discarded if files were not saved properly due to software error (8 sites), instrument 
error (2 sites), operator error (4 sites), probe movement (3 sites), or the presence of 
room light artifacts at wavelengths below 600 nm (3 sites) in at least one of the 
emission spectra. From the remaining sites, spectra from six sites were excluded 
because the tattoo could not be located and consequently a reliable histologic diagnosis 
was not available for these sites. Therefore, fluorescence EEMs from 62 sites from 20 
subjects were available for further analysis (Table 1). 
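
The site accounting above can be checked with a short tally. The following Python sketch is illustrative only; the variable names are hypothetical, and the counts are those stated in the preceding paragraph.

```python
# Illustrative tally of measurement sites; variable names are hypothetical.
sites_measured = 88

# Sites discarded during data review, as listed in the text.
discarded = {
    "software error": 8,
    "instrument error": 2,
    "operator error": 4,
    "probe movement": 3,
    "room light artifacts": 3,
}

# Sites excluded because the tattoo could not be located.
sites_without_histology = 6

sites_after_review = sites_measured - sum(discarded.values())
sites_for_analysis = sites_after_review - sites_without_histology
print(sites_after_review, sites_for_analysis)
```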

Data Analysis 

Fluorescence data were analyzed to determine which excitation and emission 
wavelengths contained the most diagnostically useful information and to estimate the 
performance of diagnostic algorithms based on this information. Algorithms based on 
multi-variate discriminant analysis were considered. Algorithms based on 
combinations of emission spectra at various excitation wavelengths were developed in 
order to determine which excitation wavelengths contained the most diagnostic 
information. Then, at those excitation wavelengths, algorithms based on reduced 
numbers of emission wavelengths were developed to determine whether complete 
emission spectra were required or whether accurate diagnosis could be made using 
multi-spectral measurements at a few excitation/emission wavelength combinations. 

In each case, the algorithm development process, described in detail below, 
included the following major steps: (1) data pre-processing to reduce inter-patient 
variations, (2) data reduction to reduce the dimensionality of the data set, (3) feature 
selection and classification to develop algorithms which maximized diagnostic 
performance and minimized the likelihood of over-training in a training set, and (4) 
unbiased evaluation of these algorithms using the technique of cross-validation. 

Diagnostic Categories 

Multi-variate discriminant algorithms were sought to separate two tissue 
categories: normal and abnormal. The abnormal class contained sites with dysplasia, 
carcinoma in situ and squamous cell carcinoma; the normal class contained sites 
which were clinically and/or histologically normal as well as sites with benign changes 
such as inflammation. 

Data Pre-processing 

Fluorescence data from a single measurement site are represented as a matrix 
containing calibrated fluorescence intensity as a function of excitation and emission 
wavelength. Columns of this matrix correspond to emission spectra at a particular 
excitation wavelength; rows of this matrix correspond to excitation spectra at a 
particular emission wavelength. Each excitation spectrum contains 18 intensity 
measurements; each emission spectrum contains between 50 and 130 intensity 
measurements depending on the excitation wavelength. Most multi-variate data 
analysis techniques require vector input rather than matrix input, so the column 
vectors containing the emission spectra at excitation wavelengths selected for 
evaluation were concatenated into a single vector in order to explore which excitation 
wavelengths contained the most diagnostic information. 

Our previous work illustrated that spectra of the oral cavity obtained in vivo show 
large patient-to-patient variations in intensity that can be greater than the inter- 
category differences. Therefore, the inventors explored pre-processing methods to 
reduce the inter-patient variations while preserving inter-category differences. While 
many different methods of pre-processing are possible, two methods were selected for 
evaluation here: (1) normalization of all emission spectra of a given excitation 
wavelength combination to the maximum intensity contained within that combination, 
and (2) normalization of each emission spectrum to its maximum intensity. 
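
The two normalization schemes, together with the concatenation step described above, may be sketched in Python as follows. This is a minimal illustration under the assumption that each emission spectrum is held as a plain list of intensities; the function names are hypothetical.

```python
def normalize_to_combination_peak(spectra):
    """Method (1): divide every spectrum by the single largest intensity
    found anywhere in the selected combination of emission spectra."""
    peak = max(max(spectrum) for spectrum in spectra)
    return [[value / peak for value in spectrum] for spectrum in spectra]


def normalize_each_to_own_peak(spectra):
    """Method (2): divide each spectrum by its own maximum intensity."""
    return [[value / max(spectrum) for value in spectrum] for spectrum in spectra]


def concatenate(spectra):
    """Join the emission spectra end to end into one input vector."""
    return [value for spectrum in spectra for value in spectrum]


# Two hypothetical emission spectra that differ only in overall intensity.
spectra = [[1.0, 2.0], [4.0, 8.0]]
print(concatenate(normalize_to_combination_peak(spectra)))
print(concatenate(normalize_each_to_own_peak(spectra)))
```

In this toy case the first method keeps the fourfold intensity difference between the two spectra, whereas the second makes them identical, which illustrates why the second method discards relative intensity information.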



Reduction of Excitation Wavelength Number 

In this study, fluorescence emission spectra were measured at 18 different 
excitation wavelengths. One goal of data analysis was to determine which 
combination of excitation wavelengths contains the most diagnostic information. The 
inventors considered combinations of up to four emission spectra. Limiting the 
number of wavelengths to four allows for construction of a reasonably cost-effective 
clinical spectroscopy system. Two strategies were considered to identify the optimal 
wavelength combination. The first was to identify the single wavelength which gives 
the best diagnostic performance, then the wavelength of those remaining that most 
improves diagnostic performance, and so forth until performance no longer improves 
or four wavelengths have been selected. The second method was to evaluate all 
possible combinations of up to four wavelengths chosen from the 18 possible 
excitation wavelengths. This equates to 18 combinations of one, 153 combinations of 
two, 816 combinations of three, and 3,060 combinations of four excitation 
wavelengths, for a total of 4,047 combinations. While the first method requires less 
computational time, it is only appropriate for normalization methods that remove 
relative intensity information. Otherwise, the best single wavelength may not be part 
of the best wavelength pair that exploits differences in relative intensity. The second 
method can be used with either normalization scheme and in addition, provides a tool 
to rank the top wavelength combinations, rather than identifying the single best 
wavelength combination, so this method was pursued. 
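
The combination counts quoted above can be reproduced with the Python standard library. The sketch below is illustrative; it assumes the 18 excitation wavelengths run from 330 nm to 500 nm in 10 nm steps, as stated for the instrument, and the variable names are hypothetical.

```python
from itertools import combinations
from math import comb

# 18 excitation wavelengths, 330 nm to 500 nm in 10 nm increments.
excitation_nm = list(range(330, 501, 10))

# Number of ways to choose 1, 2, 3 or 4 wavelengths from the 18 available.
counts = {k: comb(len(excitation_nm), k) for k in range(1, 5)}
total = sum(counts.values())

# Exhaustive enumeration, as used by the second search strategy.
all_combinations = [
    combo for k in range(1, 5) for combo in combinations(excitation_nm, k)
]

print(counts, total, len(all_combinations))
```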

Algorithm Development 

For each of the 4,047 combinations of one to four excitation wavelengths, 
spectra from the entire data set were used as a training set to develop multi-variate 
algorithms to separate normal and abnormal tissues based on their fluorescence 
emission spectra at all possible wavelength combinations. Algorithm development 
consisted of three steps: (1) pre-processing, (2) data reduction and (3) development of 
a classification algorithm which maximized diagnostic performance. Data were pre- 
processed using the two normalization schemes described above. For each 
normalization, principal component analysis was performed using the entire dataset 
and eigenvectors accounting for 65, 75, 85, and 95% of the total variance were 
retained. Principal component scores associated with these eigenvectors were 
calculated for each sample. Discriminant functions were then formed to classify each 
sample as normal or abnormal. The classification was based on the Mahalanobis 
distance, which is a multivariate measure of the separation of a point from a dataset in 
n-dimensional space. Each sample was held out one at a time and the Mahalanobis 
distances between the held-out sample and the remaining normal and abnormal 
samples were calculated; the sample was classified according to the category 
corresponding to the smallest distance. The sensitivity and specificity of the 
algorithm were then evaluated relative to diagnoses based on histopathology (in 
patients suspected to have oral cavity malignancy) or clinical impression (in normal 
volunteers). Overall diagnostic performance was evaluated as the sum of the 
sensitivity and the specificity, thus minimizing the number of misclassifications 
(when prevalence of disease and normal are approximately equal). The performance 
of the diagnostic algorithm depended on the principal component scores which were 
included. Four different diagnostic algorithms were developed using principal 
component scores derived from eigenvectors accounting for increasing amounts of 
total variance. From the available pool of principal component scores, the single 
principal component score yielding the best initial performance was identified, and 
then the principal component score that most improved this performance was selected. 
This process was repeated until performance was no longer improved by the addition of 
principal component scores, or all available scores were selected. The pool of 
available eigenvectors is specified by a variance criterion, the eigenvector significance 
level (ESL), that represents the minimum variance fraction accounted for by the sum 
of the n largest eigenvalues. In this work the inventors examined 4 ESLs, 
corresponding to 65%, 75%, 85% and 95% of the total variance. 
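
A minimal sketch of the data reduction and classification steps is given below in Python using NumPy. It is an illustration under several assumptions that are not stated in the disclosure: a separate covariance matrix is estimated for each class, a pseudo-inverse is used to guard against singular covariance, and all retained principal component scores are used rather than the stepwise score selection described above. All function names are hypothetical.

```python
import numpy as np


def pca_scores(data, esl):
    """Project samples onto the fewest leading eigenvectors whose eigenvalues
    account for at least the fraction `esl` of the total variance."""
    data = np.asarray(data, dtype=float)
    centered = data - data.mean(axis=0)
    covariance = np.atleast_2d(np.cov(centered, rowvar=False))
    eigenvalues, eigenvectors = np.linalg.eigh(covariance)
    order = np.argsort(eigenvalues)[::-1]            # largest variance first
    eigenvalues, eigenvectors = eigenvalues[order], eigenvectors[:, order]
    cumulative = np.cumsum(eigenvalues) / eigenvalues.sum()
    n_keep = min(int(np.searchsorted(cumulative, esl)) + 1, len(eigenvalues))
    return centered @ eigenvectors[:, :n_keep]


def mahalanobis_sq(point, group):
    """Squared Mahalanobis distance from `point` to the samples in `group`."""
    diff = point - group.mean(axis=0)
    covariance = np.atleast_2d(np.cov(group, rowvar=False))
    return float(diff @ np.linalg.pinv(covariance) @ diff)


def leave_one_out_classify(scores, labels):
    """Hold out each sample in turn and assign it to the nearer class
    (0 = normal, 1 = abnormal) by Mahalanobis distance."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels)
    predictions = np.empty_like(labels)
    for i in range(len(labels)):
        keep = np.arange(len(labels)) != i
        distances = {
            c: mahalanobis_sq(scores[i], scores[keep & (labels == c)])
            for c in (0, 1)
        }
        predictions[i] = min(distances, key=distances.get)
    return predictions


def sensitivity_specificity(predictions, labels):
    """Fraction of abnormal sites called abnormal, and of normal sites called normal."""
    predictions, labels = np.asarray(predictions), np.asarray(labels)
    sensitivity = float(np.mean(predictions[labels == 1] == 1))
    specificity = float(np.mean(predictions[labels == 0] == 0))
    return sensitivity, specificity
```

A caller would typically compute `pca_scores` once per ESL value (0.65, 0.75, 0.85, 0.95) and compare the resulting sums of sensitivity and specificity.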

Comparing Performance of Various Excitation Wavelength Combinations 

At each ESL, the wavelength combinations were ranked in order of decreasing 
performance, based upon the sum of sensitivity and specificity. The combinations 
were ranked and evaluated based upon training performance. However, as the ESL 
approaches 100%, over-training becomes more likely, since the available pool of 
eigenvectors will account for nearly 100% of the variance, including variance due to 
noise. The magnitude of diagnostically important variances is unknown. 

The risk of over-training was assessed for the top 25 wavelength 
combinations of two, three, and four excitation wavelengths, by comparing the 
training set performance to the performance of an algorithm developed from the same 
data after the diagnoses corresponding to each measurement site had been 
randomized. This provides a dataset with the same variance structure as the original 
dataset, but where the diagnostic performance is not expected to exceed that of 
chance. In order to make equivalent comparisons, the disease prevalence in the real 
sample was maintained in the randomly assigned diagnoses. Diagnostic algorithms 
were then developed again which minimized the number of misclassified samples at a 
specified eigenvector significance level (ESL). Random diagnoses were assigned fifty 
times for each wavelength combination and the average and standard deviation of the 
sum of the sensitivity and specificity were calculated. Ideally, for completely 
normally distributed data, the sum of the sensitivity and specificity should be one for 
the randomized diagnosis at all levels of training significance. However, if over- 
training occurs, this sum will be greater than one. The top 25 wavelength 
combinations were then ranked again based on the absolute difference between the 
training set performance and random diagnosis assignment. This method allows the 
top wavelength combinations to be ranked in order of their robustness, or lack of 
propensity to over-train. For a given number of wavelengths per combination, the 
differences were ranked across all four eigenvector significance levels. The combination 
with the largest difference, usually seen at ESL values of 65%, was selected as the optimal 
wavelength combination. This criterion selects the wavelength combination that is least prone to 
over-training. 
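
The randomized-diagnosis comparison may be sketched in Python as follows. This is an illustration only: `evaluate` is a hypothetical caller-supplied function that retrains the algorithm on the shuffled diagnoses and returns the sum of sensitivity and specificity, and the other names are likewise assumptions.

```python
import random
import statistics


def randomized_label_baseline(labels, evaluate, n_repeats=50, seed=0):
    """Shuffle the diagnoses `n_repeats` times and return the mean and standard
    deviation of the performance reported by `evaluate` for each shuffle.

    Shuffling is a permutation, so the disease prevalence of the real
    sample is preserved in every random assignment.
    """
    rng = random.Random(seed)
    sums = []
    for _ in range(n_repeats):
        shuffled = list(labels)
        rng.shuffle(shuffled)
        sums.append(evaluate(shuffled))
    return statistics.mean(sums), statistics.stdev(sums)


def overtraining_margin(training_sum, random_mean):
    """Absolute difference between training performance and the average
    performance with randomly assigned diagnoses; larger is more robust."""
    return abs(training_sum - random_mean)
```

Wavelength combinations would then be ranked by `overtraining_margin`, with the largest margin indicating the combination least prone to over-training.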

Validation of Algorithm Performance 

Although the optimal wavelength combination has been identified based upon 
comparison of its performance to that which can be achieved when the tissue 
diagnoses have been randomized, our estimates of algorithm performance are still 
biased since they are based on the same training set used to develop the algorithm. An 
unbiased performance estimate must be made to assess the true potential of this 
wavelength combination. The effects of over-training in performance estimation can 
be minimized by using separate training and validation sets, or by using the method 
of cross-validation. The data set here was not sufficiently large to divide into separate 
training and validation sets; therefore the inventors used the cross-validation method. 
In this method, all data from one patient are temporarily removed from the data set, 
the algorithm is developed using the remaining data set, and then the new algorithm is 
applied to the left-out sites. This is repeated until data from each patient have been left 
out once. Cross-validation was used to provide an unbiased estimate of the 
performance of the top three combinations of excitation wavelengths with each 
normalization. 
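
The leave-one-patient-out splitting described above may be sketched in Python as follows. The function name and the use of a per-site list of patient identifiers are assumptions made for illustration.

```python
def leave_one_patient_out(patient_ids):
    """Yield (held_out_patient, training_indices, test_indices) so that all
    sites from one patient are withheld together in each round."""
    for held_out in sorted(set(patient_ids)):
        train = [i for i, p in enumerate(patient_ids) if p != held_out]
        test = [i for i, p in enumerate(patient_ids) if p == held_out]
        yield held_out, train, test


# Hypothetical identifiers: three patients contributing six sites in total.
for patient, train, test in leave_one_patient_out(["A", "A", "B", "C", "C", "C"]):
    print(patient, train, test)
```

Grouping by patient, rather than by site, keeps sites from the same person out of the training data when that person's sites are being classified.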

Reduction of Emission Wavelength Number 

The inventors investigated whether effective diagnostic algorithms could be 
developed using reduced numbers of emission wavelengths at the top performing 
excitation wavelength combinations. The inventors calculated the component 
loadings associated with the eigenvectors corresponding to the principal component 
scores selected in these algorithms. A component loading represents the correlation 
between each principal component and the original pre-processed fluorescence 
emission spectra at each excitation wavelength. The component loadings at each 
excitation wavelength were evaluated to select fluorescence intensities at a minimum 
number of excitation-emission wavelength pairs required for the algorithms to 
perform with a minimal decrease in classification accuracy. Portions of the 
component loadings most highly correlated (correlation >0.5 or <-0.5) with 
corresponding emission spectra at each excitation wavelength were selected and the 
reduced data matrix was then used to regenerate and evaluate the algorithms. 
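
A component loading of this kind, and the selection of strongly correlated variables, may be sketched in Python with NumPy as follows. The function names are hypothetical, and treating the loading as a Pearson correlation between one principal component score and each pre-processed variable is an assumption based on the description above.

```python
import numpy as np


def component_loadings(data, scores):
    """Correlation between one principal component score (length n) and each
    column of the pre-processed data matrix (n samples by m variables)."""
    data = np.asarray(data, dtype=float)
    scores = np.asarray(scores, dtype=float)
    centered = data - data.mean(axis=0)
    s = scores - scores.mean()
    numerator = centered.T @ s
    denominator = np.sqrt((centered ** 2).sum(axis=0) * (s ** 2).sum())
    loadings = np.zeros_like(numerator, dtype=float)
    # Variables with zero variance are given a loading of zero.
    np.divide(numerator, denominator, out=loadings, where=denominator > 0)
    return loadings


def select_correlated_variables(loadings, threshold=0.5):
    """Indices of variables with correlation > threshold or < -threshold."""
    return np.flatnonzero(np.abs(loadings) > threshold)
```

The indices returned by `select_correlated_variables` would identify the candidate emission wavelengths from which a reduced data matrix could be built.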

Results 

Fluorescence EEMs from 62 sites from 20 subjects were available for further 
analysis (Table 1). Of these 62 sites, 37 were measured from the tongue, eight from 
the floor of mouth (FOM), seven from the buccal mucosa, four from the gingiva, one 
from the palate, and five from the lip. There were 52 normal, four dysplastic, and six 
cancerous sites. The data set consisted of two types of normal sites: adjacent normals 
and normals from a population without oral cancer. Adjacent normals are the visually 
normal sites taken from patients that have suspected lesions elsewhere in the oral 
cavity. In this data set there were 17 adjacent normal (histologically normal) sites 
from eleven patients, and 35 visually normal sites taken from nine volunteers. 

The visual screening accuracy of the head and neck physicians for this data set 
was 100% sensitivity and 83% specificity. This performance was determined by 
comparing the visual impressions of the clinicians to the histologic findings upon 
excision. Results of the analysis of the spectroscopic data are presented according to 
the normalization method used. 

Normalization by peak emission intensity of the concatenated vector 

The top 25 combinations of one to four excitation wavelengths were ranked in 
order of the largest difference between the sum of the sensitivity and the specificity in the 
training set and the average performance with randomly assigned diagnoses. The top 
3 combinations correspond to the following excitation wavelength combinations: (350, 
380, 400, 480 nm), (350, 380, 400, 490 nm), and (350, 380, 400 nm). All of these combinations 
demonstrate approximately the same training set performance, with 100% sensitivity 
and 90% specificity. These combinations have three wavelengths in common. Since 
no performance benefit was observed when a fourth wavelength was added for the top 
performing combinations, combinations of four wavelengths were not pursued any 
further. The top 25 combinations of three excitation wavelengths, ranked in order of 
the largest difference between the sum of the sensitivity and the specificity in the training set 
and the average performance with randomly assigned diagnoses, are given in Table 2. 
The ranking of each combination based upon training set performance is given as 
well. Table 2 gives the diagnostic performance of each combination for both the 
training set and the average performance for the data set with randomized diagnosis. 
The random diagnosis performance demonstrated that the combinations showed 
varying propensities to over-train. 

A histogram depicting the frequency at which each wavelength appeared in the 
top 25 combinations from Table 2 is shown in Figure 34 for various ESLs. At low 
ESL values of 65%, 75% and 85% the diagnostic importance of excitation at 350, 
380, and 400 nm is evident. This is seen in the histograms for wavelength 
combinations of two and four as well (data not shown). 

To provide an unbiased estimate of performance of these algorithms, the 
diagnostic performance of the top wavelength combinations was evaluated by using 
the method of cross-validation using the full data set. The wavelength combination 
(350, 380, 400 nm) demonstrated a cross-validation performance of 100% sensitivity 
and 88% specificity. The other two combinations, (350, 380, 400, 480 nm) and (350, 380, 
400, 490 nm), demonstrated identical performance upon cross-validation with a 
sensitivity of 100% and a specificity of 90%. 

The emission spectra corresponding to all 62 sites at the three excitation 
wavelengths common to these combinations are shown in Fig. 36. Visual 
examination of Fig. 36 confirms the diagnostic potential of this wavelength 
combination. The identified combinations demonstrate the importance of the relative 
intensities as seen following normalization to the maximum intensity in the 
concatenated emission vector. With this normalization, the normal sites demonstrate 
greater fluorescence intensity at 380 nm excitation, 450 nm emission than the 
abnormal sites. Additionally, the remaining emission peaks tend to be more intense in 
normal sites than in abnormal sites in most instances. The normal sites misclassified 
as abnormal are easily seen in Figure 36. Histologically, these sites demonstrated 
increased vascularity, suggesting that increased hemoglobin absorption is one 
cause of the reduced relative fluorescence intensity from these sites. 

The algorithm based on the combination of 350, 380 and 400 nm excitation 
wavelengths selected only a single principal component score, associated with the 
eigenvector that accounted for most of the total variance. Figure 37 shows this 
eigenvector and the associated component loading. The eigenvector depicts the 
general lineshape of the normalized spectra shown in Figure 37. The component 
loading shows that the principal component score for this eigenvector is highly 
correlated to approximately four regions of the concatenated emission vector. Single 
emission intensities within these ranges were selected arbitrarily and are denoted as 
solid green circles in Figure 37. These points correspond to the emission intensities of 
418 and 470 nm at 350 nm excitation, 448 nm emission at 380 nm excitation, and 
502 nm emission at 400 nm excitation. An algorithm was developed using the same 
data reduction and classification methods as above based upon this reduced data set. 
The training performance of the reduced algorithm is 100% sensitivity and 90% 
specificity, and the cross-validated performance is 90% sensitivity and 90% specificity 
compared to 100% sensitivity and 88% specificity for the algorithm based on the 
entire emission spectra. This algorithm uses a higher ESL of 95% since the reduced 
data set contains less variance introduced by noise. Motivated by the desire to 
construct a simple device that could interrogate or image large areas of tissue, a 
reduced algorithm based upon a single emission wavelength was evaluated. The 
emission wavelength chosen, 472 nm, was common to all three emission spectra. The 
training performance of this reduced algorithm was 100% sensitivity and 88% specificity, 
and upon cross-validation it was 90% sensitivity and 88% specificity. 

Normalization of each emission spectrum by its peak emission prior to concatenation 

The analysis was repeated using concatenated vectors in which each emission 
spectrum was normalized to its peak intensity. This method removes relative intensity 
information and relies on differences in fluorescence lineshape. The maximum 
difference between training performance and the performance after random diagnosis 
assignment was 0.58 compared to 0.82 using the other normalization method. 
Consequently, the top wavelength combination identified (350, 380, 400, 430 nm) 
showed poor performance upon cross-validation with a sensitivity of 50% and a 
specificity of 88%. It is interesting to note that the previously identified wavelengths 
(350, 380, 400 nm) are also a part of this combination, indicating that the line shape at 
these wavelengths contains diagnostic information. 

Discussion and Conclusions 

This Example identified the optimal excitation wavelengths for in vivo 
detection of oral cancers with fluorescence spectroscopy. The optimal excitation 
wavelengths were found to be 350, 380 and 400 nm. An unbiased estimate of an 
algorithm based on the entire emission spectra at these excitation wavelengths yields a 
sensitivity of 100% and specificity of 88%. Increasing the number of excitation 
wavelengths did not improve algorithm performance. Better algorithm performance 
was obtained when data were normalized to the peak emission intensity of the 
concatenated vector than when each emission spectrum was normalized to its own 
peak emission intensity. The discriminating ability of this wavelength combination 
is due to differences in both relative intensity and spectral line shape. The number of 
emission wavelengths could be significantly reduced as well without compromising 
algorithm performance. An algorithm based on four emission intensities: 418 and 470 
nm at 350 nm excitation, 448 nm emission at 380 nm excitation, and 502 nm emission 
at 400 nm excitation yielded 90% sensitivity and 90% specificity upon cross- 
validation. When only a single emission wavelength of 472 nm, common to all three 
excitation wavelengths, was used, algorithm performance on cross-validation was 90% 
sensitivity and 88% specificity. 

The unbiased performance estimate for the diagnostic algorithms based on 
fluorescence spectroscopy shows a higher sensitivity than current visual screening 
techniques done by experts. In their hands, visual screening has been reported to have 
a sensitivity of 74% and specificity of 99%. The performance of visual screening by 
experts in this study was 100% sensitivity, 83% specificity. 

It is interesting to note that emission spectra obtained at 400 nm excitation are 
included in a majority of the top combinations. Hemoglobin has a strong absorption 
maximum near this location, suggesting that differences in absorption due to perfusion 
may offer diagnostic information. This suggests that the combination of reflectance 
and fluorescence spectroscopy may offer improved diagnostic performance. 

Head and Neck Analysis - Reflectance 

A FastEEM system was also used to measure tissue reflectance spectra over 
the visible region of the spectrum at three source-detector fiber separations. The 
inventors have analyzed these data with at least two goals: (1) to determine the 
diagnostic potential of reflectance spectroscopy for detection of neoplasia of the oral 
cavity, and (2) to determine the combined diagnostic potential of fluorescence and 
reflectance spectroscopy for detection of neoplasia of the oral cavity. 

Study Design 

Nine normal volunteers and 11 patients with a known or suspected premalignant 
or malignant oral cavity lesion were recruited to participate in the study at the Head 
and Neck Surgery Clinic at The University of Texas M.D. Anderson Cancer Center. 
Written informed consent was obtained from each person in the study. 

Instrument 

The spectroscopic system used to measure reflectance spectra has been 
described in detail previously and is briefly summarized here. It consists of a Xenon 
arc lamp and a 295 nm long-pass filter which provide broadband illumination, a fiber 
optic probe which directs light to the tissue and collects diffusely reflected light from 
three locations (position 1, position 2, position 3), and an imaging spectrograph and 
CCD which detects the reflected light intensity as a function of wavelength. Fibers for 
illumination and collection of diffuse reflectance are arranged in a ring at the edge of 
the probe. The collection fibers are located 1.1, 2.1 and 3 mm from a single 
illumination fiber. All fibers have a core diameter of 200 microns. White light from 
the Xe lamp is coupled to the proximal end of the illumination fiber. The distal ends 
of the fibers are flush with the probe tip and placed in direct contact with the sample 
surface. Using this system, oral cavity tissue reflectance spectra from 390-590 nm 
with a spectral resolution of 4 nm were collected in approximately 30 seconds. The 
signal to noise ratio exceeded 75:1 for 90% of the data.

Procedure 

Reflectance spectra were wavelength calibrated with a mercury light source. 
Dark current and background were recorded before each measurement with the same 
settings but with illumination turned off. These background measurements were 
subtracted from each reflectance measurement offline. Reflectance data are reported 
relative to a 2.68% by volume solution of 1.072 micron diameter polystyrene 
microspheres (Polyscience Inc., Warrington, PA). The probe was placed on the 
outside wall of a 1 cm path length cuvette containing the microsphere solution. The 
total integrated reflectance of this standard was measured on a double beam 
spectrophotometer (U-3300 Hitachi, Tokyo, Japan) with an integrating sphere 
attachment (Labsphere Inc., North Sutton, NH). This was used to correct the 
reflectance measurements of the microsphere solution made with the spectroscopic 
system. Tissue spectra at each collection fiber position were divided pointwise by the 
corrected standard reflectance spectrum at the corresponding fiber position. 
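
As an illustration only, the background subtraction and normalization described above might be sketched as follows in Python with NumPy. The function name and the array layout (one row per collection fiber position, one column per wavelength) are assumptions. The corrected standard spectrum is taken as an input because the exact form of the spectrophotometer correction is not detailed here.

```python
import numpy as np

def calibrate_reflectance(tissue_raw, background, standard_corrected):
    """Return tissue reflectance relative to the microsphere standard.

    All arguments are arrays of shape (n_positions, n_wavelengths).
    `standard_corrected` is assumed to be the standard spectrum after
    the integrating-sphere correction has already been applied.
    """
    tissue_raw = np.asarray(tissue_raw, dtype=float)
    background = np.asarray(background, dtype=float)
    standard_corrected = np.asarray(standard_corrected, dtype=float)
    # Remove dark current and background recorded with illumination off.
    tissue = tissue_raw - background
    # Pointwise division, position by position and wavelength by wavelength.
    return tissue / standard_corrected
```

Each row of the result corresponds to one source-detector separation, so the three collection positions are normalized independently.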

Before the probe was used it was disinfected with Metricide (Metrex Research 
Corp.) in accordance with standard protocol. The probe was then guided into the oral 
cavity and its tip positioned flush with the mucosa. Then reflectance spectra were
measured. 

Reflectance spectra were measured from 9 volunteers with no history of oral 
cavity neoplasia at 35 clinically normal sites in the oral cavity (see Table 3). No
biopsies were obtained from volunteers. Following visual screening in 11 patients
with a known or suspected premalignant or malignant oral cavity lesion, reflectance 
spectra were measured from 27 sites. The physician placed the fiber optic probe on a 
lesion or suspected lesion and the reflectance of that site was measured. In addition to 
the three to five visually abnormal sites, reflectance spectra were measured from one
to three contralateral normal sites. Post-spectroscopy, abnormal sites were tattooed
with India ink where the probe measured the spectra. A clinical diagnosis of each
lesion as normal, abnormal (not dysplastic), abnormal (dysplastic) or cancerous was
recorded by an experienced head and neck surgeon (AMG) or dental oncologist (RJ).
During follow-up surgery, a 2-4 mm biopsy of the tissue was taken from the tattooed
area. These specimens were evaluated by an experienced pathologist (BK) using light
microscopy and classified as normal, mucosal reactive atypia (MRA), dysplasia or
cancer using standard diagnostic criteria. Biopsies with multiple diagnoses were
classified according to the most severe pathological diagnosis. The pathologist and 
clinicians were blinded to the results of the spectroscopic analyses. 

Data Analysis

Reflectance spectra were further processed to reduce noise. A moving average 
with a width of 10 nm was applied to each spectrum; following this, intensities of all 
reflectance spectra were extracted in 5 nm steps from 400 to 585 nm and individually 
analyzed. In addition, the first (slope) and second derivatives of the reflectance spectra
were calculated between 400 and 580 nm in 5 nm steps.
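
The smoothing and sampling steps above can be sketched as follows. This is a minimal sketch under stated assumptions: the raw spectrum is uniformly sampled, the moving average is a symmetric boxcar spanning about 10 nm, and derivatives are estimated by finite differences with `numpy.gradient`. The text does not say how the derivatives were computed, so that choice and the function name are illustrative only.

```python
import numpy as np

def smooth_and_differentiate(wavelengths, spectrum, window_nm=10.0,
                             step_nm=5.0, lo=400.0, hi=585.0, deriv_hi=580.0):
    """Smooth a spectrum, sample it every `step_nm`, and estimate derivatives."""
    wavelengths = np.asarray(wavelengths, dtype=float)
    spectrum = np.asarray(spectrum, dtype=float)
    spacing = wavelengths[1] - wavelengths[0]   # assumes uniform sampling
    n = max(1, int(round(window_nm / spacing)))
    if n % 2 == 0:
        n += 1                                  # odd length keeps the window centered
    kernel = np.ones(n) / n
    smoothed = np.convolve(spectrum, kernel, mode="same")
    # Intensities extracted in 5 nm steps from 400 to 585 nm.
    grid = np.arange(lo, hi + step_nm / 2.0, step_nm)
    intensity = np.interp(grid, wavelengths, smoothed)
    # Finite-difference slope and second derivative on the 5 nm grid.
    first = np.gradient(intensity, step_nm)
    second = np.gradient(first, step_nm)
    keep = grid <= deriv_hi + 1e-9              # derivatives reported to 580 nm
    return grid, intensity, first[keep], second[keep]
```

The input spectrum should extend at least half a window beyond 400 and 585 nm so that the boxcar does not run off the ends of the data.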

An exploratory data analysis was carried out to determine which source- 
detector separations and wavelength regions were useful to separate three tissue 
categories: normal, dysplasia and cancer. The normal class contained sites which were 
clinically and/or histologically normal as well as benign changes such as
inflammation.

For each diagnostic category (normal, dysplasia, cancer) the inventors
calculated the average value and standard deviation of the intensity at each 
wavelength, and the first and second derivative at each wavelength. These values were 
calculated separately for each source detector separation. The Student's t-test was 
used to determine whether differences in these mean values were statistically 
significant between groups of two categories. The inventors examined normal tissues 
vs. abnormal tissues (dysplasia and cancer) as well as normal tissues vs. dysplasia. 
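
As a hedged illustration of this comparison, the sketch below computes the pooled-variance (Student's) t statistic for every parameter at once and ranks the parameters by its magnitude. The function names are hypothetical. Because the same two groups are compared for every parameter, the degrees of freedom are constant, so ranking by decreasing |t| gives the same order as ranking by increasing p-value. Obtaining the p-values themselves would additionally require the t distribution, which is not implemented here.

```python
import numpy as np

def pooled_t_statistic(group_a, group_b):
    """Student's t statistic per column for two groups of sites.

    Each group has shape (n_sites, n_parameters), where a parameter is an
    intensity, slope, or second derivative at one wavelength and position.
    """
    a = np.asarray(group_a, dtype=float)
    b = np.asarray(group_b, dtype=float)
    na, nb = a.shape[0], b.shape[0]
    pooled_var = ((na - 1) * a.var(axis=0, ddof=1) +
                  (nb - 1) * b.var(axis=0, ddof=1)) / (na + nb - 2)
    std_err = np.sqrt(pooled_var * (1.0 / na + 1.0 / nb))
    return (a.mean(axis=0) - b.mean(axis=0)) / std_err

def rank_parameters(group_a, group_b):
    """Indices of parameters ordered from most to least significant."""
    t = pooled_t_statistic(group_a, group_b)
    return np.argsort(-np.abs(t))
```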

Parameters which were most statistically significant, corresponding to the 
lowest p-values, were examined further for diagnostic ability. The inventors 
constructed two-dimensional scatter plots which showed the most statistically 
significant parameter values for each site measured to determine which parameters 
could most effectively discriminate between the two categories of normal and 
abnormal (dysplasia and cancer). All calculations and graphs were produced with
Matlab® (Mathworks Inc.) and the Statistical Toolbox for Matlab.

Results 

Figures 38 through 40 show the reflectance spectra, first and second derivative 
at each of the three source detector separations for all sites measured. Figures 41 
through 43 show the average value plus and minus one standard deviation for normal, 
dysplastic and cancer sites. Normal sites are shown in green, dysplasia in blue and 
cancer in red. In general, the spectra of cancer sites show the highest reflectance 
intensity at all wavelengths measured, while spectra of normal and dysplastic sites are 
lower in intensity and more similar. Differences in intensity are greatest at position 1 
and least at position 3. For cancers, the slope and second derivative of the reflectance
spectra are greater at 440 and 480 nm and lower at 520 nm.

Figure 44 shows the p values comparing the mean intensity, mean first and 
second derivatives of normal tissue versus abnormal tissues, at each wavelength at the 
three different source detector separations. Figure 45 shows the p values comparing 
the mean intensity, mean first and second derivatives of normal tissue versus 
dysplastic tissues, at each wavelength at the three different source detector 
separations. A low value indicates a statistically significant result; the inventors are 
particularly interested in those with values less than 0.05. 

At each source-detector fiber separation, the inventors ranked the intensity,
first and second derivatives at each wavelength in order of increasing p-value. Tables
4-6 show the results when normal and abnormal tissues were compared; Tables 7-9
show the results when normal and dysplastic tissues were compared. Results are
shown for p-values less than or equal to 0.05.

In order to explore the diagnostic contributions provided by these wavelength 
regions, the inventors highlighted all regions where the p-value was less than or equal 
to 0.01 for first and second derivatives and less than or equal to 0.02 for intensity. 
These values are highlighted in gray in Tables 4-9. This resulted in a total of 15
different parameters. The slope and second derivative near 440-460 nm at positions 1
and 2 were identified as diagnostically useful regions, as were the slope and second
derivative near 500-510 nm at position 3. The intensity from 450-51 nm and 570-585
nm at position 2 were also identified as diagnostically useful.

Two-dimensional scatter plots containing all possible pairwise combinations of
these 15 groups of parameters were generated (105 total combinations). Figures 46-48
show three representative examples. Figure 46 shows the second derivative at 430
nm for position 2 vs. the second derivative at 495 nm for position 1. The straight
line represents an algorithm to separate normal findings from dysplasias and cancers,
and results in a sensitivity of 80% and a specificity of 85%. Figure 47 shows the
second derivative at 450 nm for position 1 vs. the first derivative at 510 nm for
position 3. The straight line represents an algorithm to separate normal findings
from dysplasias and cancers, and results in a sensitivity of 80% and a specificity of
82%. Figure 48 shows the second derivative at 410 nm for position 1 vs. the first
derivative at 510 nm for position 3. The straight line represents an algorithm to
separate normal findings from dysplasias and cancers, and results in a sensitivity of
70% and a specificity of 75%. In each case, the lines were drawn to minimize the
total number of samples misclassified. These sensitivity and specificity values are
slightly lower than those achieved in the previous section using fluorescence alone,
and reflect the greater overlap in the reflectance of tissues from the three groups than
is seen in the fluorescence spectra. However, the fluorescence algorithms were based
on multi-variate classifiers to enable the use of more than two parameters in the
algorithm. These techniques were next pursued using reflectance spectra.
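
The count of 105 pairwise combinations, and the way a straight line on a scatter plot yields a sensitivity and specificity, can be illustrated with the following sketch. The function names are hypothetical, and the convention that abnormal sites fall above the line is an assumption made for the example.

```python
from itertools import combinations
import numpy as np

def parameter_pairs(n_parameters):
    """All unordered pairs of parameter indices, one per scatter plot."""
    return list(combinations(range(n_parameters), 2))

def line_performance(x, y, is_abnormal, slope, intercept):
    """Sensitivity and specificity of the rule 'abnormal if above the line'."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    is_abnormal = np.asarray(is_abnormal, dtype=bool)
    called_abnormal = y > slope * x + intercept
    # Fraction of truly abnormal sites that the line calls abnormal.
    sensitivity = float(np.mean(called_abnormal[is_abnormal]))
    # Fraction of truly normal sites that the line calls normal.
    specificity = float(np.mean(~called_abnormal[~is_abnormal]))
    return sensitivity, specificity
```

With 15 parameters, `parameter_pairs(15)` returns 15 x 14 / 2 = 105 pairs, matching the number of scatter plots stated above.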

Multi-Variate Discriminant Algorithms

Reflectance spectra were analyzed to determine which wavelength ranges and 
source-detector fiber separations contained the most diagnostically useful information 
and to estimate the performance of multi-variate diagnostic algorithms based on this 
information. The inventors considered algorithms based on multi-variate discriminant
analysis. First, the inventors developed algorithms based on reflectance spectra, or
their first or second derivatives over various wavelength ranges at each source-detector
fiber separation in order to determine which types of spectra, wavelength
ranges and fiber separations contained the most diagnostic information. In addition,
the inventors developed algorithms using the concatenated spectra (or their first or
second derivatives) at all fiber separations over various wavelength ranges. In each
case, the algorithm development process, described in detail below, consisted of the
following major steps: (1) data reduction to reduce the dimensionality of the data set,
(2) feature selection and classification to develop algorithms which maximized
diagnostic performance and minimized the likelihood of over-training in a training set,
and (3) unbiased evaluation of these algorithms using the technique of cross-validation.

Diagnostic Categories

Multi-variate discriminant algorithms were sought to separate two tissue 
categories: normal and abnormal. The abnormal class contained sites with dysplasia, 
carcinoma in situ and squamous cell carcinoma; the normal class contained sites
which were clinically and/or histologically normal as well as benign changes such as 
inflammation. 

Algorithm Development 

For each of the different types of spectra and wavelength ranges, spectra from 
the entire data set were used as a training set to develop multi-variate algorithms to
separate normal and abnormal tissues based on their reflectance. Algorithm 
development included two steps: (1) data reduction and (2) development of a 
classification algorithm which maximized diagnostic performance. For each type of 
data, principal component analysis was performed using the entire dataset and 
eigenvectors accounting for 65%, 75%, 85%, 95% and 99% of the total variance were
retained. Principal component scores associated with these eigenvectors were
calculated for each sample. Discriminant functions were then formed to classify each 
sample as normal or abnormal. The classification was based on the Mahalanobis 
distance, which is a multivariate measure of the separation of a point from a dataset in 
n-dimensional space. Each sample was held out one at a time and the Mahalanobis
distances between the held-out sample and the remaining normal and abnormal
samples were calculated; the sample was classified according to the category 
corresponding to the smallest distance. The sensitivity and specificity of the 
algorithm were then evaluated relative to diagnoses based on histopathology (in 
patients suspected to have oral cavity malignancy) or clinical impression (in normal 
volunteers). Overall diagnostic performance was evaluated as the sum of the 
sensitivity and the specificity, thus minimizing the number of misclassifications 
(when prevalence of disease and normal are approximately equal). The performance 
of the diagnostic algorithm depended on the principal component scores which were 
included. Five different diagnostic algorithms were developed using principal 
component scores derived from eigenvectors accounting for increasing amounts of 
total variance. From the available pool of principal component scores, the single
principal component score yielding the best initial performance was identified, and 
then the principal component score that most improved this performance was selected. 
This process was repeated until performance was no longer improved by the addition 
of principal component scores, or all available scores were selected. The pool of
available eigenvectors is specified by a variance criterion, eigenvector significance 
level (ESL), that represents the minimum variance fraction accounted for by the sum 
of the n largest eigenvalues. In this work the inventors examined 5 ESLs, 
corresponding to 65%, 75%, 85%, 95% and 99% of the total variance. 
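
The algorithm development steps described above (principal component analysis at a given ESL, leave-one-out classification by Mahalanobis distance, and stepwise selection of principal component scores) can be sketched as follows. This is an illustrative sketch, not the inventors' Matlab code. The function names are hypothetical, the pseudo-inverse of the group covariance is used as a guard against singular matrices, and each group is assumed to retain at least two samples after hold-out.

```python
import numpy as np

def pca_scores(data, esl):
    """Scores on the fewest leading eigenvectors whose variance fraction >= esl."""
    centered = data - data.mean(axis=0)
    _, s, vt = np.linalg.svd(centered, full_matrices=False)
    cumulative = np.cumsum(s ** 2) / np.sum(s ** 2)
    k = min(int(np.searchsorted(cumulative, esl)) + 1, len(s))
    return centered @ vt[:k].T

def mahalanobis_sq(point, group):
    """Squared Mahalanobis distance from `point` to the samples in `group`."""
    diff = point - group.mean(axis=0)
    cov = np.atleast_2d(np.cov(group, rowvar=False))
    return float(diff @ np.linalg.pinv(cov) @ diff)

def leave_one_out_performance(scores, is_abnormal):
    """Hold each sample out in turn and assign it to the nearer class."""
    is_abnormal = np.asarray(is_abnormal, dtype=bool)
    predicted = np.zeros_like(is_abnormal)
    for i in range(len(scores)):
        keep = np.ones(len(scores), dtype=bool)
        keep[i] = False
        d_abnormal = mahalanobis_sq(scores[i], scores[keep & is_abnormal])
        d_normal = mahalanobis_sq(scores[i], scores[keep & ~is_abnormal])
        predicted[i] = d_abnormal < d_normal
    sensitivity = float(predicted[is_abnormal].mean())
    specificity = float((~predicted[~is_abnormal]).mean())
    return sensitivity, specificity

def forward_select(scores, is_abnormal):
    """Greedily add the score that most improves sensitivity + specificity."""
    remaining = list(range(scores.shape[1]))
    chosen, best = [], -np.inf
    while remaining:
        trial = {j: sum(leave_one_out_performance(scores[:, chosen + [j]],
                                                  is_abnormal))
                 for j in remaining}
        j_best = max(trial, key=trial.get)
        if trial[j_best] <= best:
            break                      # no further improvement
        best = trial[j_best]
        chosen.append(j_best)
        remaining.remove(j_best)
    return chosen, best
```

In use, `pca_scores` would be called once per ESL (0.65, 0.75, 0.85, 0.95, 0.99) and `forward_select` applied to each resulting pool of scores, giving five candidate algorithms per data type and wavelength range.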

Comparing Performance of Various Data Types and Wavelength Ranges 

At each ESL, wavelength range and type of data the inventors calculated the 
sum of sensitivity and specificity. As the ESL approaches 100%, over-training 
becomes more likely, since the available pool of eigenvectors will account for nearly 
100% of the variance, including variance due to noise. The magnitude of 
diagnostically important variances is unknown. The risk of over-training was
assessed for each of the types of input data by comparing the training set performance
to the performance of an algorithm developed from the same data after the diagnoses
corresponding to each measurement site had been randomized. This provides a
dataset with the same variance structure as the original dataset, but where the
diagnostic performance is not expected to exceed that of chance. In order to make
equivalent comparisons, the disease prevalence in the real sample was maintained in
the randomly assigned diagnoses. Diagnostic algorithms were then developed again
which minimized the number of misclassified samples at a specified eigenvector
significance level (ESL). Random diagnoses were assigned fifty times for each
wavelength combination and the average and standard deviation of the sum of the
sensitivity and specificity were calculated. Ideally, for completely normally
distributed data, the sum of the sensitivity and specificity should be one for the
randomized diagnosis at all levels of training significance. However, if over-training
occurs, this sum will be greater than one. At each ESL, wavelength range and type of
data the inventors calculated the absolute difference between the training set
performance and random diagnosis assignment. This method allows the best types of
data and wavelength ranges to be identified based on their robustness, or lack of
propensity to over-train. Unlike the analysis of the fluorescence from the oral cavity, in
this case all sensitivity and specificity values were calculated for the case of
cross-validation. This proved to be necessary since the eigenvectors which contained
diagnostically useful information contributed a relatively smaller amount of the total
variance for reflectance than for fluorescence. The data type and wavelength range with
the largest differences were selected as optimal. This criterion selects the data type and
wavelength range that is least prone to over-training.
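
The randomized-diagnosis comparison can be sketched as below. Shuffling the existing labels preserves the disease prevalence, as the text requires. The function names are hypothetical, and `performance` stands for any caller-supplied routine that develops an algorithm on the given features and labels and returns the sum of sensitivity and specificity.

```python
import numpy as np

def randomized_baseline(features, is_abnormal, performance,
                        n_repeats=50, seed=0):
    """Mean and standard deviation of performance under shuffled diagnoses."""
    rng = np.random.default_rng(seed)
    labels = np.asarray(is_abnormal, dtype=bool)
    # A permutation keeps the number of abnormal sites fixed.
    sums = np.array([performance(features, rng.permutation(labels))
                     for _ in range(n_repeats)])
    return float(sums.mean()), float(sums.std(ddof=1))

def overtraining_margin(features, is_abnormal, performance, **kwargs):
    """Absolute difference between true-label and shuffled-label performance."""
    labels = np.asarray(is_abnormal, dtype=bool)
    observed = performance(features, labels)
    base_mean, base_std = randomized_baseline(features, labels,
                                              performance, **kwargs)
    return abs(observed - base_mean), base_mean, base_std
```

A baseline mean well above one signals that the algorithm development procedure is fitting noise, and a large margin indicates a data type and wavelength range that is comparatively robust.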

Results - Multi-Variate Discriminant Algorithms

Tables 10-12 show the absolute difference between the training set 
performance and random diagnosis assignment for the different data types, 
wavelength ranges and ESLs. The inventors selected an improvement of 0.5 as 
significant for first and second derivative data and an improvement of greater than 0.4 
as significant for intensity data (since this is easier to measure in a multi-spectral
imaging system). Wavelength ranges, data types and ESLs with at least this
improvement are highlighted in Tables 10-12. Eight types of data met these criteria;
however, the wavelength ranges associated with several of them overlapped
significantly. In this case, the combination with the best performance increase was
selected, resulting in the following four combinations: (1) Intensity at position 2 from 
395-475 nm at 95% ESL, (2) Intensity at positions 1-3 from 425-500 nm at 99% ESL, 
(3) Slope at position 1 from 450-525 nm at 65% ESL and (4) Slope at position 3 from 
395-550 nm at 95% ESL. Table 13 gives the cross-validated sensitivity and 
specificity for algorithms based on these data types, wavelength ranges and ESLs. 
The best performance was achieved using the slope at position 3 from 395-550 nm at 
95% ESL, with a cross-validated sensitivity of 70% and a specificity of 100%. This 
compares favorably to the scatter plot shown in Figure 47, which shows the second 
derivative at 450 nm for position 1 vs. the slope at 510 nm for position three, where a 
simple linear discriminant algorithm resulted in a sensitivity of 80% and a specificity 
of 82%. 

Head and Neck Analysis- Combination of Fluorescence and Reflectance 

In general, the performance of multi-variate algorithms based on reflectance 
spectroscopy alone was somewhat lower than that based on fluorescence spectroscopy 
alone. However, from an instrumentation point of view, it may be easier to measure 
reflectance images and spectra since signal to noise ratio is higher. Therefore, the 
inventors explored whether the combination of reflectance and fluorescence spectroscopy
may provide better discrimination. Further, the inventors examined whether
the good performance of the fluorescence algorithm may be maintained if the number 
of fluorescence excitation wavelengths were reduced, but reflectance spectra were 
measured. 

In the previous analyses, the inventors identified a combination of emission
spectra at three excitation wavelengths as optimal for diagnosis based on fluorescence
spectroscopy and four types of reflectance data which were optimal for diagnosis. The 
inventors evaluated the performance of the following combinations of data at ESLs of 
65%, 75%, 85%, 95% and 99%: (a) Fluorescence at three excitation wavelengths + 
each type of reflectance data, (b) Fluorescence at all combinations of two excitation 
wavelengths + each type of reflectance data, and (c) Fluorescence at each single
excitation wavelength + each type of reflectance data. 

The performance of these combinations was compared to that which could be 
achieved with fluorescence alone. Since the number of samples where both 
fluorescence and reflectance data were available was smaller than that for either type 
of data alone, the inventors re-evaluated the performance of algorithms based on 
reflectance or fluorescence data alone using this reduced dataset. The inventors also 
evaluated the performance of fluorescence alone at one or two excitation wavelengths 
using this reduced dataset. Table 14 shows the number of patients and sites where 
both reflectance and fluorescence data were available. Results, reported as sensitivity 
and specificity giving best performance under cross validation, are shown in Tables 
15-18 for each type of reflectance data. 

The performance of the fluorescence algorithm based on three excitation 
wavelengths does not improve when any of the four types of reflectance data are also 
incorporated. The performance of fluorescence algorithms based on two excitation
wavelengths was lower than that for three excitation wavelengths; incorporation of
any of the four types of reflectance spectra did not improve performance. The
performance of fluorescence algorithms based on a single excitation wavelength was 
lower than that for two and three excitation wavelengths. Best results were obtained 
using spectra at 400 nm excitation. Incorporation of any of the four types of 
reflectance spectra did not improve performance. 

All of the methods and apparatus disclosed and claimed herein can be made 
and executed without undue experimentation in light of the present disclosure. While 
the apparatus and methods of this invention have been described in terms of certain 
embodiments, it will be apparent to those of skill in the art that variations may be 
applied to the methods and/or apparatus described herein without departing from the
concept, spirit and scope of the invention. 

CLAIMS

1. An apparatus for performing fluorescence and spatially resolved reflectance 
spectroscopy on a sample, comprising: 
a light source; 

a monochromator in optical communication with said light source; 
a reflectance illumination fiber in optical communication with said light 
source; 

a fluorescence excitation fiber in optical communication with said
monochromator;
an imaging spectrograph; 

a fluorescence collection fiber in optical communication with said imaging 
spectrograph; 

a reflectance collection fiber in optical communication with said imaging
spectrograph and in spaced relation with said reflectance illumination
fiber; and

a detector in optical communication with said imaging spectrograph.

2. The apparatus of claim 1, wherein said light source comprises a Xe arc lamp. 

3. The apparatus of claim 1, wherein said monochromator comprises a double 
monochromator. 

4. The apparatus of claim 1, wherein said detector comprises a thermo-electrically 
cooled CCD camera. 

5. The apparatus of claim 1, wherein said fluorescence excitation fiber and said 
fluorescence collection fiber are integral. 



6. The apparatus of claim 1, wherein one or more of said fibers are positioned flush
with said sample. 

7. The apparatus of claim 1, further comprising a spacer positioned between one or
more of said fibers and said sample. 

8. The apparatus of claim 1, wherein said reflectance illumination fiber, said
fluorescence excitation fiber, said fluorescence collection fiber, and said reflectance
collection fiber define a fiber optic probe.

9. The apparatus of claim 8, wherein said probe is configured to be positioned within 
a trocar. 

10. The apparatus of claim 8, wherein said probe comprises a center section and an 
outer section, said fluorescence excitation fiber and said fluorescence collection fiber 
being positioned in said center section, and said reflectance illumination fiber and said 
reflectance collection fiber being positioned in said outer section. 

11. The apparatus of claim 1, comprising a plurality of fluorescence excitation and
collection fibers arranged in a circular bundle.

12. The apparatus of claim 1, comprising a plurality of reflectance collection fibers 
defining a plurality of collection positions.

13. The apparatus of claim 12, wherein said plurality of collection positions are 
spaced between about 0 and about 10 millimeters from said reflectance illumination 
fiber.

14. The apparatus of claim 1, wherein said reflectance collection fiber defines a 
collection position at about 180 degrees relative to said reflectance illumination fiber. 





15. The apparatus of claim 1, wherein said reflectance collection fiber defines a 
collection position at about 90 degrees relative to said reflectance illumination fiber. 

16. The apparatus of claim 1, wherein said reflectance collection fiber defines a
collection position at about 45 degrees relative to said reflectance illumination fiber.

17. The apparatus of claim 1, further comprising one or more fibers in optical
communication with said light source and configured to illuminate said sample during 
operation of said apparatus. 

18. The apparatus of claim 1, comprising a plurality of fluorescence excitation fibers 
arranged in one or more rows adjacent said monochromator. 

19. The apparatus of claim 1, comprising a plurality of fluorescence excitation fibers 
and a plurality of reflectance collection fibers arranged in a single row adjacent said 
imaging spectrograph. 

20. The apparatus of claim 19, further comprising one or more unconnected fibers
interspersed with said plurality of fluorescence excitation fibers and said plurality of 
reflectance collection fibers. 

21. The apparatus of claim 1, further comprising a fiber connected from said light 
source to said imaging spectrograph to monitor spectral output of said light source. 

22. The apparatus of claim 1, further comprising a controller coupled to said detector. 

23. An apparatus for measuring fluorescence and spatially resolved reflectance 
spectra of a sample, comprising: 

a light source; 

a monochromator in optical communication with said light source; 

a fiber optic probe in optical communication with said light source and with 
said monochromator, said probe comprising a plurality of fluorescence 
excitation and collection fibers in spaced relation and a plurality of 
reflectance collection fibers in spaced relation with a reflectance 
illumination fiber; 

an imaging spectrograph in optical communication with said plurality of
fluorescence collection fibers and with said plurality of reflectance 
collection fibers; and 

a detector in optical communication with said imaging spectrograph. 

24. The apparatus of claim 23, wherein said plurality of reflectance collection fibers 
and said reflectance illumination fiber are positioned concentrically about said 
plurality of fluorescence excitation and collection fibers.



25. The apparatus of claim 23, wherein at least one of said plurality of reflectance
collection fibers defines a collection position at about 180 degrees relative to said 
reflectance illumination fiber. 



26. The apparatus of claim 23, wherein at least one of said plurality of reflectance
collection fibers defines a collection position at about 90 degrees relative to said
reflectance illumination fiber.

27. The apparatus of claim 23, wherein at least one of said plurality of reflectance 
collection fibers defines a collection position at about 45 degrees relative to said
reflectance illumination fiber.

28. The apparatus of claim 23, wherein said plurality of collection positions are 
spaced between about 0 and about 10 millimeters from said reflectance illumination 
fiber. 

29. The apparatus of claim 23, wherein said probe comprises between twenty-one and 
forty-six optical fibers. 

30. A method for combined fluorescence and spatially resolved reflectance 
spectroscopy of a sample, comprising:

directing radiation to said sample with a fluorescence excitation fiber;
collecting radiation from said sample with a fluorescence collection fiber;
directing said radiation from said sample to an imaging spectrograph and a
detector; 

illuminating said sample with a reflectance illumination fiber; 
collecting reflected light from said sample with a reflectance collection fiber in 
spaced relation with said reflectance illumination fiber; and

directing said reflected light from said sample to an imaging spectrograph and 
a detector. 

31. The method of claim 30, wherein said collecting reflected light comprises
collecting reflected light from a plurality of collection positions with a plurality of
reflectance collection fibers.

32. The method of claim 30, wherein said collecting reflected light comprises 
collecting reflected light from said sample with a reflectance collection fiber defining
a collection position at about 180 degrees relative to said reflectance illumination
fiber.

33. The method of claim 30, wherein said collecting reflected light comprises 
collecting reflected light from said sample with a reflectance collection fiber defining
a collection position at about 90 degrees relative to said reflectance illumination fiber.

34. The method of claim 30, wherein said collecting reflected light comprises 
collecting reflected light from said sample with a reflectance collection fiber defining 
a collection position at about 45 degrees relative to said reflectance illumination fiber. 

35. The method of claim 30, wherein said sample comprises ovarian, head and neck, 
or cervical tissue. 

36. The method of claim 30, further comprising analyzing spectral data from said 
detector to characterize said sample.

37. The method of claim 36, wherein said analyzing comprises pre-processing said
data and reducing a dimension of said data using principal component analysis. 

38. The method of claim 37, wherein said analyzing further comprises selecting one 
or more diagnostic principal components of said data and forming one or more 
algorithms. 

39. The method of claim 38, wherein said analyzing further comprises forming one or 
more composite algorithms. 

40. The method of claim 38, wherein said analyzing further comprises evaluating at 
least one of said algorithms using a cross-validation technique.

41. A method for combined fluorescence and spatially resolved reflectance 
spectroscopy of a sample, comprising:

directing radiation to said sample with a fluorescence excitation fiber; 
collecting radiation from said sample with a fluorescence collection fiber; 
directing said radiation from said sample to an imaging spectrograph and a 
detector; 

illuminating said sample with a reflectance illumination fiber; 

collecting reflected light at a plurality of collection positions from said sample
with a plurality of reflectance collection fibers arranged in spaced
relation;

directing said reflected light from said sample to an imaging spectrograph and
a detector to produce spectral data;
pre-processing said data; and 

reducing a dimension of said data using principal component analysis. 

42. The method of claim 41, further comprising selecting one or more diagnostic
principal components of said data and forming one or more algorithms. 

43. The method of claim 42, further comprising forming one or more composite
algorithms.

44. The method of claim 43, further comprising evaluating at least one of said 
algorithms using a cross-validation technique. 

45. A method for analyzing spectroscopy data to define an optimized reduced data
set, comprising: 

pre-processing said spectroscopy data; 

reducing a dimension of said spectroscopy data using principal component 
analysis; and 

selecting one or more diagnostic principal components of said spectroscopy 
data. 

46. The method of claim 45, wherein said spectroscopy data comprises combined 
fluorescence and spatially resolved reflectance spectroscopy data. 

47. The method of claim 45, wherein said pre-processing comprises normalization of 
said spectroscopy data. 

48. The method of claim 45, wherein said pre-processing comprises mean scaling 
said spectroscopy data. 

49. The method of claim 45, wherein said pre-processing comprises calculating one 
or more derivatives on said spectroscopy data. 

50. The method of claim 45, further comprising eliminating redundant data from said 
spectroscopy data. 

51. The method of claim 45, further comprising forming one or more algorithms and 
evaluating at least one of said algorithms using a cross validation technique. 

52. The method of claim 51, further comprising forming one or more composite
algorithms. 